lucene-dev mailing list archives

From "Ryan Ernst (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SOLR-4643) Refactor shard handler (and factory) to make pieces more pluggable
Date Mon, 25 Mar 2013 21:23:16 GMT
Ryan Ernst created SOLR-4643:
--------------------------------

             Summary: Refactor shard handler (and factory) to make pieces more pluggable
                 Key: SOLR-4643
                 URL: https://issues.apache.org/jira/browse/SOLR-4643
             Project: Solr
          Issue Type: Improvement
            Reporter: Ryan Ernst


Over the past few weeks I've been trying to write my own shard handler/factory, and it is
a bit of a pain.  The pieces that I don't want to reimplement are tightly coupled to those
that I do.

I believe the current design is as follows:

ShardHandlerFactory - created once and shared across cores (except in some legacy case where
it is per core?).  It holds the "heavyweight" objects, such as the thread pool for parallelizing
requests and the HttpClient.  It also keeps a SolrJ load balancer object.

ShardHandler - created per request.  It has the logic for determining whether a request is
distributed and for sending the requests in parallel (using an executor from the parent factory
object).  The knowledge of how to send requests and parse responses is embedded within the
parallelization piece (through SolrJ code).

I've made two attempts so far to make this easier to plug into:
https://issues.apache.org/jira/browse/SOLR-4544
This was an attempt to reuse the code for parallelizing the requests while still plugging
in code for making the requests.  It sort of works, but it was just a stopgap measure.  You
still cannot format the request or parse the response without reimplementing ShardHandler.

https://issues.apache.org/jira/browse/SOLR-4613
Here I was trying to require creating a shard handler only when the request is distributed,
instead of on every request just to find out whether it is distributed.
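
A minimal sketch of that idea, in Java.  Everything here is hypothetical (the class, the
Handler interface, and the rule that a non-empty "shards" parameter means "distributed" are
assumptions for illustration, not the code in Solr): the distributed check runs first, and
the handler is only built when the check passes.

```java
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical illustration only; these types are not part of Solr.
final class LazyShardHandling {

    /** Stand-in for a per-request shard handler. */
    interface Handler {
        String handle(Map<String, String> params);
    }

    /** Assumed rule for this sketch: a non-empty "shards" parameter means distributed. */
    static boolean isDistributed(Map<String, String> params) {
        String shards = params.get("shards");
        return shards != null && !shards.isEmpty();
    }

    /** Creates a handler via the factory only when the request is distributed. */
    static String process(Map<String, String> params, Supplier<Handler> factory) {
        if (!isDistributed(params)) {
            return "local"; // no handler was created for this request
        }
        return factory.get().handle(params);
    }
}
```

The point of passing a Supplier rather than a Handler is that a non-distributed request never
pays for handler construction.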

At this point I thought I would create a jira to write down a proposal for how to do this
refactoring, instead of continuing with piecemeal, out-of-context jiras.


I view this shard handler business as needing the following:
1. Something to parallelize the requests.  Most people (if anyone) should never have to
replace this.  It contains the thread pool and executor service and is global (like the shard
handler factory now).

2. Something that knows how to talk to the shards.  This includes formatting the request and
parsing the response. This could probably be per core or even per request handler?

3. Something to do load balancing.  This could probably be in 2, although I could see it being
separate for easier plugging of LB without having to handle request/response format or vice
versa.  It would contain the http client for talking to hosts, and so probably still be global.
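
To make the split concrete, here is a rough sketch of the three pieces as separate Java
types.  All of the names (ShardRequestParallelizer, ShardTransport, ShardLoadBalancer) are
hypothetical and do not exist in Solr, and shards are modeled simply as lists of replica
URLs; this is one possible shape under those assumptions, not a finished design.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// All names below are hypothetical; none of them are existing Solr APIs.

/** Piece 2: knows how to format a request for a shard and parse its raw response. */
interface ShardTransport<Q, R> {
    String formatRequest(Q query);
    R parseResponse(String rawResponse);
}

/** Piece 3: picks a replica for a shard and performs the call (would own the http client). */
interface ShardLoadBalancer {
    String send(List<String> replicaUrls, String formattedRequest);
}

/** Piece 1: global fan-out. Owns the thread pool and knows nothing about wire formats. */
final class ShardRequestParallelizer implements AutoCloseable {
    private final ExecutorService executor;

    ShardRequestParallelizer(int threads) {
        this.executor = Executors.newFixedThreadPool(threads);
    }

    /** Sends the query to every shard in parallel; results are in shard order. */
    <Q, R> List<R> submitToAll(List<List<String>> shards, Q query,
                               ShardTransport<Q, R> transport,
                               ShardLoadBalancer loadBalancer) {
        String formatted = transport.formatRequest(query);
        List<Future<R>> futures = new ArrayList<>();
        for (List<String> replicas : shards) {
            Callable<R> task =
                () -> transport.parseResponse(loadBalancer.send(replicas, formatted));
            futures.add(executor.submit(task));
        }
        List<R> results = new ArrayList<>();
        for (Future<R> future : futures) {
            try {
                results.add(future.get());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for shard", e);
            } catch (ExecutionException e) {
                throw new IllegalStateException("shard request failed", e.getCause());
            }
        }
        return results;
    }

    @Override
    public void close() {
        executor.shutdown();
    }
}
```

With this shape, someone who only wants a different wire format implements ShardTransport,
and someone who only wants different host selection implements ShardLoadBalancer; neither
has to touch the parallelization code.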

I would love to get consensus on the design of this before going off and doing it, and suggestions
for how to break this into smaller pieces.
