lucene-solr-dev mailing list archives

From "Uri Boness (JIRA)" <j...@apache.org>
Subject [jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory
Date Wed, 27 Jan 2010 18:55:34 GMT

    [ https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805612#action_12805612 ]

Uri Boness commented on SOLR-1725:
----------------------------------

{quote}
Performance:

It looks like scripts are read from the resource loader and parsed again (eval) for every
update request. This can be pretty expensive, esp for those scripting languages that generate
java class files instead of using an interpreter. One way to combat this would be to cache
and reuse them.
{quote}
Yes, indeed the scripts are evaluated per request, but for a reason. One of the goals here
is to keep the scripts as close as possible to the update processor interface, so the functions
in the scripts have the same signatures as the methods in the processor. But in order for the
scripts to be flexible, I decided to introduce some globally scoped variables which are accessible
in the functions (currently the Solr request, the response and a logger). The
problem is that the API only defines 3 scopes where you can register variables, and the lowest
one is the engine itself. Since the evaluation of a script is done on the engine level as
well, when using this API together with the global variables I don't think you can escape
the need for creating an engine per request (and thus also evaluating the scripts per request).

But I agree with you that if there is a way around it, caching the evaluated/compiled scripts
will definitely speed things up. I'll need to investigate this further and come up with alternatives
(I already have some ideas using ThreadLocals).

bq. Should we have a way to specify a script in-line (in solrconfig.xml)?

Personally I prefer keeping the solrconfig.xml as clean as possible. I do however think that
standardizing Solr scripting support in general would be great (for example, have a
scripts folder under _solr.solr.home_ where all the scripts are placed, or come up with a standard
configuration structure for the scripts... perhaps something in the direction Hoss suggested
above).

bq. This seems to raise the visibility of the UpdateCommand classes, directly exposing them
to users w/o plugins. We should perhaps consider interface cleanups on these classes at the
same time as this issue.
+1

bq. Examples! Using javascript (since it's both fast and included in JDK6), let's see what
the scripts are for some common usecases. This both helps improve the design as well as lets
other people give feedback w/o having to read through code.
Yep, that would probably be very helpful. Basically, I think anyone who has ever written an
update processor could try to convert it to a script and see how it works. The usual
use case for me is to just add a few fields which are derived from the other fields, but perhaps
there are some other more interesting use cases out there. I guess these examples should be
put in the Wiki, right?
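
As a starting point for such examples, here is a minimal sketch of the derived-field use case. It assumes the add command exposes the document as {{cmd.solrDoc}}, as AddUpdateCommand does in Solr 1.4. The field names and the small stand-in document used to try the function outside Solr are made up for illustration and are not part of the patch.

```javascript
// Sketch of an update-processor script that derives one field from two others.
// Assumption: the add command exposes the SolrInputDocument as cmd.solrDoc.
// The field names (first_name, last_name, full_name) are hypothetical.
function processAdd(cmd) {
  var doc = cmd.solrDoc;
  var first = doc.getFieldValue("first_name");
  var last = doc.getFieldValue("last_name");
  // Only derive the field when both source fields are present.
  if (first != null && last != null) {
    doc.setField("full_name", first + " " + last);
  }
}

// Stand-in for SolrInputDocument so the function can be tried outside Solr.
// It mimics only the two methods used above; it is not part of Solr.
function makeFakeDoc(fields) {
  return {
    getFieldValue: function (name) {
      return fields.hasOwnProperty(name) ? fields[name] : null;
    },
    setField: function (name, value) {
      fields[name] = value;
    }
  };
}

var doc = makeFakeDoc({ first_name: "Ada", last_name: "Lovelace" });
processAdd({ solrDoc: doc });
// doc.getFieldValue("full_name") is now "Ada Lovelace"
```

Only {{processAdd}} is defined here; the other processor methods are left out because this use case does not need them.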





> Script based UpdateRequestProcessorFactory
> ------------------------------------------
>
>                 Key: SOLR-1725
>                 URL: https://issues.apache.org/jira/browse/SOLR-1725
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>    Affects Versions: 1.4
>            Reporter: Uri Boness
>         Attachments: SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch,
SOLR-1725.patch
>
>
> A script-based UpdateRequestProcessorFactory (uses JDK6 script engine support). The main
goal of this plugin is to be able to configure/write update processors without the need to
write and package Java code.
> The update request processor factory enables writing update processors in scripts located
in the {{solr.solr.home}} directory. The factory accepts one (mandatory) configuration parameter
named {{scripts}} which accepts a comma-separated list of file names. It will look for these
files under the {{conf}} directory in solr home. When multiple scripts are defined, their
execution order is defined by the lexicographical order of the script file names (so {{scriptA.js}}
will be executed before {{scriptB.js}}).
> The script language is resolved based on the script file extension (that is, a *.js file
will be treated as a JavaScript script), therefore an extension is mandatory.
> Each script file is expected to have one or more methods with the same signature as the
methods in the {{UpdateRequestProcessor}} interface. It is *not* required to define all methods,
only those that are required by the processing logic.
> The following variables are defined as global variables for each script:
>  * {{req}} - The SolrQueryRequest
>  * {{rsp}} - The SolrQueryResponse
>  * {{logger}} - A logger that can be used for logging purposes in the script
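
To make the description above concrete, a script might look like the following sketch. It defines only two of the processor methods and uses the {{logger}} and {{rsp}} globals. The stand-in objects at the bottom exist only so the script can be tried outside Solr; they, the {{indexed_by}} field and the {{script_adds}} response key are assumptions for illustration, not part of the patch.

```javascript
// Sketch of a script that implements only the methods it needs.
// req, rsp and logger are the globals described above; inside Solr they are
// provided by the factory. Methods that are not needed (processDelete,
// processCommit, ...) can simply be left out.
var added = 0;

function processAdd(cmd) {
  // Assumption: the document is reachable as cmd.solrDoc.
  cmd.solrDoc.setField("indexed_by", "script");   // hypothetical field
  added++;
  logger.info("processed add #" + added);
}

function finish() {
  rsp.add("script_adds", added);   // hypothetical response key
}

// --- Stand-ins for trying the script outside Solr (remove when deploying) ---
var logLines = [];
var responseValues = {};
var logger = { info: function (msg) { logLines.push(msg); } };
var rsp = { add: function (name, value) { responseValues[name] = value; } };

var fields = {};
var cmd = { solrDoc: { setField: function (n, v) { fields[n] = v; } } };
processAdd(cmd);
processAdd(cmd);
finish();
```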

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

