Subject: Re: Distributed Indexing
From: Alex Cowell
To: dev@lucene.apache.org
Date: Mon, 14 Feb 2011 15:04:21 +0000

I've uploaded a patch of what we've done so far:

https://issues.apache.org/jira/browse/SOLR-2358

It's still very much a work in progress and there are some obvious issues which are being resolved at the moment (such as the inefficient method of waiting for all the docs to be processed before distributing them in one batch, and the handling of shard replicas), but any feedback is welcome.

As it stands, you can distribute add and commit requests using the HashedDistributionPolicy by simply specifying a 'shards' request parameter. Using a user-specified distribution policy (either as a param in the URL or defined in the solrconfig as Upayavira suggested) is in the works as well. Regarding that, I figure the priority for determining which policy to use would be (highest to lowest):

1. Param in the URL
2. Specified in the solrconfig
3. Hard-coded default to fall back on

That way if a user changed their mind about which distribution policy they wanted to use, they could override the default policy with their chosen one as a request parameter.
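To make the precedence concrete, here is a minimal sketch of the lookup order described above. It is only an illustration: the class name `PolicyResolution`, the method `resolvePolicyName`, and the request parameter name `distrib.policy` are all made up for this example and are not taken from the SOLR-2358 patch. Only the name HashedDistributionPolicy comes from the patch, and it is used here purely as a string.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical sketch; none of these names are taken from the SOLR-2358 patch.
final class PolicyResolution {

    // Step 3: hard-coded fallback (assumed here to be the hashed policy).
    static final String DEFAULT_POLICY = "HashedDistributionPolicy";

    // Hypothetical request parameter name; the patch may use a different one.
    static final String POLICY_PARAM = "distrib.policy";

    // Returns the policy name to use: request parameter first, then the
    // value from the config, then the hard-coded default.
    static String resolvePolicyName(Map<String, String> requestParams, String configuredPolicy) {
        String fromRequest = requestParams.get(POLICY_PARAM);
        if (fromRequest != null && !fromRequest.isEmpty()) {
            return fromRequest;          // 1. param in the URL
        }
        if (configuredPolicy != null && !configuredPolicy.isEmpty()) {
            return configuredPolicy;     // 2. specified in the solrconfig
        }
        return DEFAULT_POLICY;           // 3. hard-coded default
    }

    public static void main(String[] args) {
        Map<String, String> request = Collections.singletonMap(POLICY_PARAM, "MyPolicy");
        System.out.println(resolvePolicyName(request, "ConfiguredPolicy"));
        System.out.println(resolvePolicyName(Collections.<String, String>emptyMap(), null));
    }
}
```

Resolving a name rather than an instance keeps the sketch independent of how the policy class is actually loaded, which the patch may well do differently.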

The code has only been acceptance tested at the moment. There is a test class but it's a bit messy, so once that's tidied up and improved a little more I'll include it in the next patch.

> I haven't had time to follow all of this discussion, but this issue might help:
> https://issues.apache.org/jira/browse/SOLR-2355

Thanks - very interesting! It's reassuring to see our implementation has been following a similar structure.

There seem to be some nuances which we have yet to encounter/discover, like the way you've implemented the processCommit() method to wait for all the adds/deletes to complete before sending the commits. Are these things which you were aware of in advance that would need to be dealt with?
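For anyone following along, this is roughly how I understand the ordering concern. The sketch below is my own illustration and does not reproduce the SOLR-2355 code: the class `CommitAfterAddsSketch`, its in-memory `log`, and the simulated shards are all hypothetical. Adds are sent asynchronously, and the commit step blocks on every in-flight add before a commit goes to any shard.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; it does not reproduce the SOLR-2355 implementation.
final class CommitAfterAddsSketch {

    private final ExecutorService pool = Executors.newFixedThreadPool(4);
    private final List<Future<?>> pending = new ArrayList<>();
    private final List<String> shards;

    // Records the order in which operations reach the simulated shards.
    final List<String> log = Collections.synchronizedList(new ArrayList<String>());

    CommitAfterAddsSketch(List<String> shards) {
        this.shards = shards;
    }

    // Sends an add to one shard asynchronously and remembers the in-flight request.
    void processAdd(String shard, String docId) {
        pending.add(pool.submit(() -> {
            log.add("add " + shard + " " + docId);
        }));
    }

    // Waits for every outstanding add, then sends a commit to each shard.
    void processCommit() {
        try {
            for (Future<?> f : pending) {
                f.get(); // blocks until that add has finished, or rethrows its failure
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for adds", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("an add failed before commit", e.getCause());
        }
        pending.clear();
        for (String shard : shards) {
            log.add("commit " + shard);
        }
    }

    void shutdown() {
        pool.shutdown();
    }
}
```

Without the wait, a commit could reach a shard before an add that was submitted earlier, so the commit would not make that document visible.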

Alex