lucene-dev mailing list archives

From "Hoss Man (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-1725) Script based UpdateRequestProcessorFactory
Date Wed, 04 Jul 2012 02:34:34 GMT

    [ https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13406247#comment-13406247 ]

Hoss Man commented on SOLR-1725:
--------------------------------

bq. I think this just demands that we implement an add-only type of capability such that the
entire script is implicitly inside a processAdd call.
...
bq. Perhaps the configuration in solrconfig for this processor would have a processAddScript="myscript.rb"
and a processDeleteScript= ... etc. as an alternative to script="script.rb"

That's all fine and good and i have no objection to any of it -- but as i tried to explain
before, those alternative ideas still raise a raft of questions about what the lifecycle
of the scripts should be, what the bindings should be for the relevant objects (SolrQueryRequest,
AddDocCmd, etc...) and how they should be evaluated (CompiledScript vs. a script that is eval'ed
on every processAdd, etc...).
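
To make the evaluation question concrete, here is a minimal sketch of the two options using only standard javax.script calls. The class and method names ({{EvalStrategySketch}}, {{evalEachTime}}, {{compileOnce}}, {{demo}}) are hypothetical and are not taken from the patch; whether a script engine is installed for a given extension depends on the JVM, so the sketch reports that case instead of assuming one exists.

```java
import javax.script.Compilable;
import javax.script.CompiledScript;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

// Hypothetical illustration only; not code from the SOLR-1725 patch.
class EvalStrategySketch {

    /** Option 1: hand the source to the engine on every call, so it is re-parsed each time. */
    static Object evalEachTime(ScriptEngine engine, String source) throws ScriptException {
        return engine.eval(source);
    }

    /** Option 2: compile once if the engine is Compilable; returns null when it is not. */
    static CompiledScript compileOnce(ScriptEngine engine, String source) throws ScriptException {
        if (engine instanceof Compilable) {
            return ((Compilable) engine).compile(source);
        }
        return null;
    }

    /** Tries both options and reports which ones were available in this JVM. */
    static String demo(String extension, String source) {
        ScriptEngine engine = new ScriptEngineManager().getEngineByExtension(extension);
        if (engine == null) {
            return "no-engine"; // nothing registered for this extension
        }
        try {
            evalEachTime(engine, source);
            CompiledScript compiled = compileOnce(engine, source);
            if (compiled == null) {
                return "eval-only";
            }
            compiled.eval(); // reuses the compiled form; can be called repeatedly
            return "compiled";
        } catch (ScriptException e) {
            return "error";
        }
    }

    public static void main(String[] args) {
        System.out.println(demo("js", "1 + 2"));
    }
}
```

Which option fits depends on the lifecycle chosen for the script: a compiled form only pays off if it outlives a single call.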

Hence my point that i think we should commit this as "StatelessScriptUpdateProcessorFactory",
where the script processing mirrors the lifecycle of a native java "UpdateProcessor", and iterate
with other approaches, using other factories, in other jira issues -- if we can refactor common
stuff out then great, but we shouldn't try to overthink generalizing the internals of this
implementation in anticipation of a hypothetical future class that will likely be just as
easy to write independently.

bq. Do we really need to support multiple scripts inside a definition of a (Stateless)ScriptUpdateProcessorFactory?
It just seems like added looping when one could just define two different StatelessScriptUpdateProcessorFactory's,
each with an individual script (or, if these were my scripts, combine the logic of the scripts
into a single script).

good question.  That looping code was already there when i started looking at the patch --
i left it in mainly because:

* it was already written and didn't overly complicate things
* it seemed like it would probably be easier/simpler for a lot of users to just add a
{{<str name="script">foo.js</str>}} when they wanted to add a script than to add an entire
new {{<processor>...</processor>}}
* we use a single ScriptEngineManager per request, per UpdateProcessor instance.  _In theory_
it will be more efficient for some languages to generate ScriptEngines for each script
from the same ScriptEngineManager than from distinct ScriptEngineManagers (ie: imagine if
your scripting language was Java: configuring two scripts in a single {{<processor>}}
means you spin up one JVM per request; if you put each script in its own {{<processor>}}
you spin up 2 JVMs per request)
* according to the javax.script javadocs, because we use a single ScriptEngineManager per
request, _in theory_ any variable in "global" scope will be common across all the script
files (for that request).  (In my JVM, this doesn't work for multiple javascript scripts that
try to refer to the same global vars; no idea if other javax.script implementations support
it)
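
The "global" scope mentioned in the last bullet can be inspected from Java without running any script. The sketch below is a rough illustration under stated assumptions: the class name {{SharedGlobalScopeSketch}}, its methods and the key {{requestId}} are hypothetical. It only checks that two engines obtained from one ScriptEngineManager see the same value through their GLOBAL_SCOPE bindings; it does not show whether variables declared inside a script end up in that scope, which is the part that did not work in the JVM described above.

```java
import javax.script.Bindings;
import javax.script.ScriptContext;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;

// Hypothetical illustration only; not code from the SOLR-1725 patch.
class SharedGlobalScopeSketch {

    /** Reads a key from the engine's GLOBAL_SCOPE bindings, or null if there are none. */
    static Object globalValue(ScriptEngine engine, String key) {
        Bindings global = engine.getBindings(ScriptContext.GLOBAL_SCOPE);
        return global == null ? null : global.get(key);
    }

    /** Checks whether two engines from the same manager both see the manager's value for key. */
    static String checkShared(ScriptEngineManager manager, String extension, String key) {
        ScriptEngine first = manager.getEngineByExtension(extension);
        ScriptEngine second = manager.getEngineByExtension(extension);
        if (first == null || second == null) {
            return "no-engine"; // nothing registered for this extension in this JVM
        }
        Object expected = manager.get(key);
        boolean shared = expected != null
                && expected.equals(globalValue(first, key))
                && expected.equals(globalValue(second, key));
        return shared ? "shared" : "not-shared";
    }

    public static void main(String[] args) {
        ScriptEngineManager manager = new ScriptEngineManager();
        manager.put("requestId", "r-1"); // stored in the manager's global scope
        System.out.println(checkShared(manager, "js", "requestId"));
    }
}
```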

bq. What if I have a main file and a library file? How would that work?

No freaking clue...

* The javax.script APIs provide no mechanism for Java code to specify that modules should
be loaded before evaluating a script, or any way to configure where the engine should look
for modules if a script attempts to load them using its own native syntax
* javascript doesn't even have a language mechanism (that i know of) for a script file to
specify that it wants to "import" another file/script, so i don't even know of a decent way
to test what happens if you try in a language that does ... (ie: will it try to use the ScriptEngineManager's
classloader? will it try to read from the system default path for that language's libs?)




                
> Script based UpdateRequestProcessorFactory
> ------------------------------------------
>
>                 Key: SOLR-1725
>                 URL: https://issues.apache.org/jira/browse/SOLR-1725
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>    Affects Versions: 1.4
>            Reporter: Uri Boness
>            Assignee: Erik Hatcher
>              Labels: UpdateProcessor
>             Fix For: 4.1
>
>         Attachments: SOLR-1725-rev1.patch, SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch,
SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch,
SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch
>
>
> A script based UpdateRequestProcessorFactory (Uses JDK6 script engine support). The main
goal of this plugin is to be able to configure/write update processors without the need to
write and package Java code.
> The update request processor factory enables writing update processors in scripts located
in the {{solr.solr.home}} directory. The factory accepts one (mandatory) configuration parameter
named {{scripts}} which accepts a comma-separated list of file names. It will look for these
files under the {{conf}} directory in solr home. When multiple scripts are defined, their
execution order is defined by the lexicographical order of the script file name (so {{scriptA.js}}
will be executed before {{scriptB.js}}).
> The script language is resolved based on the script file extension (that is, a *.js file
will be treated as a JavaScript script), therefore an extension is mandatory.
> Each script file is expected to have one or more methods with the same signature as the
methods in the {{UpdateRequestProcessor}} interface. It is *not* required to define all methods,
only those that are required by the processing logic.
> The following variables are defined as global variables for each script:
>  * {{req}} - The SolrQueryRequest
>  * {{rsp}} - The SolrQueryResponse
>  * {{logger}} - A logger that can be used for logging purposes in the script
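
The ordering and extension rules in the issue description above can be sketched in a few lines of Java. This is an illustration of the rules as the description states them, not the patch's implementation; the class {{ScriptListSketch}} and its methods are hypothetical, and the patch discussed in the comment may configure scripts differently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; not code from the SOLR-1725 patch.
class ScriptListSketch {

    /** Splits a comma-separated "scripts" value and sorts the names lexicographically. */
    static List<String> orderedScripts(String scriptsParam) {
        List<String> names = new ArrayList<>();
        for (String raw : scriptsParam.split(",")) {
            String name = raw.trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        Collections.sort(names); // natural String order is lexicographic
        return names;
    }

    /** Returns the file extension used to pick the script language; it is mandatory. */
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            throw new IllegalArgumentException("script file needs an extension: " + fileName);
        }
        return fileName.substring(dot + 1);
    }

    public static void main(String[] args) {
        for (String name : orderedScripts("scriptB.js, scriptA.js")) {
            System.out.println(name + " -> " + extensionOf(name));
        }
    }
}
```

The extension returned here is the kind of value that could be passed to {{ScriptEngineManager.getEngineByExtension}} to resolve the script language.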


