lucene-solr-dev mailing list archives

From Jan Høydahl (JIRA) <j...@apache.org>
Subject [jira] Commented: (SOLR-1093) A RequestHandler to run multiple queries in a batch
Date Thu, 21 Jan 2010 16:49:54 GMT

    [ https://issues.apache.org/jira/browse/SOLR-1093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12803356#action_12803356
] 

Jan Høydahl commented on SOLR-1093:
-----------------------------------

Parallel execution of multiple queries is just one use case in a family of many others, and
I agree with Lance's post on the list that it would be better to make this an extensible component.

Other similar use cases that are often requested: multi-source federation, factoring in an ad
service, selecting sources based on query analysis, selecting sources based on results, non-Solr
sources, result modification based on content in the result, and a query abstraction layer/templating.

The common goal is to put an abstraction layer on top of the search sources which can handle
search-related functionality, so that it does not need to be implemented in every front-end. Other
products which try to fill this role are: FAST Unity, Comperio Front, Sesat (sesat.no)

Perhaps the /multi request handler could be the start of such a framework, where the first plugin
to implement is the parallel queries use case.

To be able to handle a high count for "n" without hitting HTTP GET limitations, and to get a cleaner
syntax for complex cases, the handler could accept the request as a POST. Pseudo POST content
(the format could be JSON or custom):
<steps>
  <branch type="list">
    <src name="web">qt=dismax&amp;q=$q&amp;rows=10&amp;facet=true&amp;facet.field=mimetype</src>
    <src name="google">q=$q</src>
    <src name="yp">q=category:$q^10 OR company:$q&amp;rows=3</src>
    <src name="wp">q=$q&amp;rows=3</src>
    <src name="ads">q=$q</src>
  </branch>
</steps>

The result list would then consist of five entries named web, google, yp, wp and ads.
Each "branch" and "src" would be pre-defined in config, specifying the implementing class
and any defaults. Indeed, the whole POST could be pre-configured, so that the client only needs to
supply a &steps= param to identify which "template" to choose, using $variables for q etc.
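
As a rough illustration of the template idea, here is a minimal Python sketch (not Solr code) that
reads a pre-configured template and substitutes $variables into each "src" entry. The template
contents, the function name expand_sources and the variable value are assumptions made up for this
example; a real implementation would also need to URL-encode the substituted values.

```python
import xml.etree.ElementTree as ET
from string import Template
from urllib.parse import parse_qsl

# Hypothetical pre-configured template, modelled on the pseudo POST content.
STEPS_TEMPLATE = """
<steps>
  <branch type="list">
    <src name="web">qt=dismax&amp;q=$q&amp;rows=10</src>
    <src name="yp">q=category:$q^10 OR company:$q&amp;rows=3</src>
  </branch>
</steps>
"""


def expand_sources(template_xml, variables):
    """Return {source name: parameter dict} with $variables substituted.

    Assumption: variable values contain no '&', '=' or '+' characters,
    since this sketch does not URL-encode them.
    """
    root = ET.fromstring(template_xml.strip())
    sources = {}
    for src in root.iter("src"):
        # string.Template replaces $q with the supplied value.
        query_string = Template(src.text).substitute(variables)
        sources[src.get("name")] = dict(parse_qsl(query_string))
    return sources


expanded = expand_sources(STEPS_TEMPLATE, {"q": "pizza"})
```

Here expanded["yp"]["q"] is the string "category:pizza^10 OR company:pizza", so the front-end only
has to name the template and supply the value of q.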
The class implementing "steps" simply calls each sub-step in sequence, passing the request
and response objects. This provides a simple framework for future extensions, pre- or post-processing.
The class implementing "branch" of type "list" would spawn all sub queries as threads and
include each source result in a list.
Another implementation type of "branch" could merge (federate) results instead of stacking
them.
The class implementing a "src" would be a thin wrapper which simply dispatches the query to
the Search RequestHandler. Other implementations of "src" could be wrappers for external engines
like Google or ad servers.
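
To make the division of labour between "steps", "branch" and "src" more concrete, here is a
small Python sketch of the idea. It is only an illustration under assumptions: the class names
Steps, ListBranch and Src, the dict-based request/response and the fake_search backend are all
invented for this example and are not part of Solr.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_search(params):
    """Hypothetical stand-in for dispatching to a search handler."""
    return "hits for %s rows=%s" % (params["q"], params.get("rows", 10))


class Src:
    """Thin wrapper that dispatches one query to a backend callable."""

    def __init__(self, name, search, defaults):
        self.name = name
        self.search = search      # callable: params dict -> result
        self.defaults = defaults  # configured default parameters

    def run(self, request, response):
        # Request values (e.g. the user's q) override configured defaults.
        response[self.name] = self.search({**self.defaults, **request})


class ListBranch:
    """Runs all sub-sources in parallel and stacks the results in a list."""

    def __init__(self, sources):
        self.sources = sources

    def run(self, request, response):
        def run_one(src):
            partial = {}
            src.run(request, partial)
            return partial

        workers = max(1, len(self.sources))
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # pool.map returns results in the configured source order.
            response["results"] = list(pool.map(run_one, self.sources))


class Steps:
    """Calls each sub-step in sequence with the same request and response."""

    def __init__(self, steps):
        self.steps = steps

    def run(self, request, response):
        for step in self.steps:
            step.run(request, response)
        return response


pipeline = Steps([ListBranch([Src("web", fake_search, {"rows": 10}),
                              Src("wp", fake_search, {"rows": 3})])])
result = pipeline.run({"q": "solr"}, {})
```

A federating "branch" would differ only in how it combines the partial results, and a "src" for an
external engine would differ only in the callable it wraps.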

My intention is not to suggest a huge component, but to consider whether a smart interface design
could enable very powerful extension possibilities that would be useful in almost all portal-type
applications.

> A RequestHandler to run multiple queries in a batch
> ---------------------------------------------------
>
>                 Key: SOLR-1093
>                 URL: https://issues.apache.org/jira/browse/SOLR-1093
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>            Reporter: Noble Paul
>             Fix For: 1.5
>
>
> It is a common requirement that a single page needs to fire multiple queries. In cases
> where these queries are independent of each other, it would be useful to have a handler
> which can take in multiple queries, run them in parallel and send the response as one big
> chunk.
> Let us say the handler is  MultiRequestHandler
> {code}
> <requestHandler name="/multi" class="solr.MultiRequestHandler"/>
> {code}
> h2.Query Syntax
> The request must specify the number of queries as count=n.
> Each request parameter must be prefixed with a number which denotes the query index. Optionally,
> it may also specify the handler name.
> Example:
> {code}
> /multi?count=2&1.handler=/select&1.q=a:b&2.handler=/select&2.q=a:c
> {code}
> The default handler can be '/select', so the equivalent request is
> {code} 
> /multi?count=2&1.q=a:b&2.q=a:c
> {code}
> h2.The response
> The response will be a List<NamedList>, where each NamedList is the response
> to one query.
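
For reference, the count=n syntax quoted above could be split into per-query parameter sets
roughly as in the following Python sketch. This is only an illustration of the proposed syntax,
not the handler's actual implementation; the function name split_batch_request and the returned
structure are assumptions.

```python
from urllib.parse import urlsplit, parse_qsl


def split_batch_request(url, default_handler="/select"):
    """Split a /multi request into one {handler, params} entry per query."""
    pairs = parse_qsl(urlsplit(url).query)
    count = int(dict(pairs)["count"])
    queries = [{"handler": default_handler, "params": {}} for _ in range(count)]
    for key, value in pairs:
        # "1.q" -> index "1", name "q"; "1.facet.field" -> name "facet.field".
        index, dot, name = key.partition(".")
        if not dot or not index.isdigit():
            continue  # not a per-query parameter, e.g. count itself
        position = int(index)
        if not 1 <= position <= count:
            raise ValueError("query index %d is outside 1..%d" % (position, count))
        if name == "handler":
            queries[position - 1]["handler"] = value
        else:
            queries[position - 1]["params"][name] = value
    return queries


batch = split_batch_request("/multi?count=2&1.q=a:b&2.q=a:c")
```

With the default handler applied, both example URLs from the description produce the same two
entries, one for a:b and one for a:c.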

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

