lucene-dev mailing list archives

From William Mayor <>
Subject Re: Distributed Indexing
Date Sun, 06 Feb 2011 21:57:47 GMT

Good call about the policies being deterministic, should've thought of that.

We've changed the patch to include this and I've removed the random
assignment one (for obvious reasons).
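
For anyone following along, here is a rough sketch of what "deterministic" means for a policy. It is not the code from the patch: the class name HashShardPolicy and the method shardFor are made up for illustration, and it assumes documents are routed by a string id. The point is only that the same id and the same shard list always give the same shard, which a random policy cannot guarantee.

```java
import java.util.List;

/**
 * Hypothetical deterministic shard-assignment policy (illustration only,
 * not the class from the patch). The document id is hashed and reduced
 * modulo the number of shards.
 */
final class HashShardPolicy {
    private HashShardPolicy() {}

    /** Returns the shard that should index the document with the given id. */
    static String shardFor(String docId, List<String> shards) {
        if (shards == null || shards.isEmpty()) {
            throw new IllegalArgumentException("at least one shard is required");
        }
        int n = shards.size();
        // String.hashCode() is specified by the JDK, so every node that
        // runs this computes the same value for the same id.
        // The "+ n" step keeps the bucket non-negative for negative hashes.
        int bucket = ((docId.hashCode() % n) + n) % n;
        return shards.get(bucket);
    }
}
```

Note that the result depends on the order and length of the shard list, so adding or removing a shard will remap most documents under this scheme.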

Take a look and let me know what's left to do. (



On Thu, Feb 3, 2011 at 5:00 PM, Upayavira <> wrote:

> On Thu, 03 Feb 2011 15:12 +0000, "Alex Cowell" <> wrote:
> > Hi all,
> > Just a couple of questions that have arisen.
> > 1. For handling non-distributed update requests (shards param is not
> > present or is invalid), our code currently
> >    - assumes the user would like the data indexed, so gets the request
> >    handler assigned to "/update"
> >    - executes the request using core.execute() for the SolrCore associated
> >    with the original request
> > Is this what we want it to do, and is using core.execute() from within a
> > request handler a valid method of passing on the update request?
>
> Take a look at how it is done in
> handler.component.SearchHandler.handleRequestBody(). I'd say try to follow
> as similar an approach as possible. E.g. it is the SearchHandler that does
> much of the work, branching depending on whether it found a shards parameter.
>
> > 2. We have partially implemented an update processor which actually
> > generates and sends the split update requests to each specified shard (as
> > designated by the policy). As it stands, the code shares a lot in common
> > with the HttpCommComponent class used for distributed search. Should we look
> > at "opening up" the HttpCommComponent class so it could be used by our
> > request handler as well, or should we continue with our current
> > implementation and worry about that later?
>
> I agree that you are going to want to implement an UpdateRequestProcessor.
> However, it would seem to me that, unlike search, you're not going to want
> to bother with the existing processor and associated component chain; you're
> going to want to replace the processor with a distributed version.
> As to the HttpCommComponent, I'd suggest you make your own educated
> decision. How similar is the class? Could one serve both needs effectively?
>
> > 3. Our update processor uses a MultiThreadedHttpConnectionManager to send
> > parallel updates to shards. Can anyone give some appropriate values to be
> > used for the defaultMaxConnectionsPerHost and maxTotalConnections params?
> > Won't the values used for distributed search be a little high for
> > distributed indexing?
>
> You are right, these will likely be lower for distributed indexing. However,
> I'd suggest not worrying about it for now, as it is easy to tweak later.
> Upayavira
>  ---
> Enterprise Search Consultant at Sourcesense UK,
> Making Sense of Open Source
