From: Alexey Kozhemiakin
To: "solr-user@lucene.apache.org"
Subject: RE: Sharding and Replication
Date: Fri, 9 Aug 2013 20:17:24 +0000

+1, I'd like to vote for this issue: https://issues.apache.org/jira/browse/SOLR-4956
It would be useful to have this parameter configurable.

When we index hundreds of millions of documents into a 4-shard SolrCloud in batches of 20K, the overhead of this chatty conversation with the replicas and other shards is significant. We didn't perform detailed measurements, but increasing this hardcoded value improved our indexing throughput from 1.2 million up to 3 million docs per minute.

I agree that in the general case it is more correct to reduce the value, but it would be nice to be able to control it for specific cases and environments.

Alex
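For illustration, here is a minimal SolrJ sketch of the kind of client-side batching described above. The ZooKeeper hosts, collection name, fields, and document count are placeholders for this sketch, not details taken from the thread.

import java.util.ArrayList;
import java.util.List;

import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class BatchIndexer {
    public static void main(String[] args) throws Exception {
        // Placeholder ZooKeeper ensemble and collection name.
        CloudSolrServer server = new CloudSolrServer("zk1:2181,zk2:2181,zk3:2181");
        server.setDefaultCollection("metrics");

        List<SolrInputDocument> batch = new ArrayList<SolrInputDocument>();
        for (int i = 0; i < 1000000; i++) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", Integer.toString(i));
            doc.addField("value_l", i);
            batch.add(doc);

            // One update request per 20K documents rather than one per document.
            if (batch.size() == 20000) {
                server.add(batch);
                batch.clear();
            }
        }
        if (!batch.isEmpty()) {
            server.add(batch);
        }
        server.commit();
        server.shutdown();
    }
}

Even with client-side batches this large, the adds forwarded to other shards and replicas are re-split into small buffered requests (the hardcoded 10 discussed below), which appears to be where the per-request overhead described in this thread comes from.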
-----Original Message-----
From: Erick Erickson [mailto:erickerickson@gmail.com]
Sent: Sunday, June 23, 2013 20:41
To: solr-user@lucene.apache.org
Subject: Re: Sharding and Replication

Asif:

Thanks, this is great info and may add to the priority of making this configurable.

I raised a JIRA, see: https://issues.apache.org/jira/browse/SOLR-4956
and feel free to add anything you'd like or correct anything I didn't get right.

Best
Erick

On Sat, Jun 22, 2013 at 10:16 PM, Asif wrote:
> Erick,
>
> It's a completely practical problem - we are exploring Solr to build a real-time analytics/data solution for a system handling about 1000 qps. We have various metrics that are stored as different collections on the cloud, which means a very high volume of writes. The cloud also needs to support about 300-400 qps.
>
> We initially tested with a single Solr node on a 16-core / 24 GB box for a single metric. We saw that writes were not an issue at all - Solr was handling them extremely well. We were also able to achieve about 200 qps from a single node.
>
> When we set up the cloud (an ensemble on 6 boxes), we saw very high CPU usage on the replicas. Up to 10 cores were getting used for writes on the replicas. Hence my concern with respect to batch updates for the replicas.
>
> BTW, I altered maxBufferedAddsPerServer to 1000 - and now CPU usage is very similar to the single-node installation.
>
> - Asif
>
> On Sat, Jun 22, 2013 at 9:53 PM, Erick Erickson wrote:
>> Yeah, there's been talk of making this configurable, but there are more pressing priorities so far.
>>
>> So just to be clear, is this theoretical or practical? I know of several very high-performance situations where 1,000 updates/sec (and I'm assuming that it's 1,000 docs/sec, not 1,000 batches of 1,000 docs) hasn't caused problems. So unless you're actually seeing performance problems, as opposed to fearing that there _might_ be some, I'd just go on to the next urgent problem.
>>
>> Best
>> Erick
>>
>> On Fri, Jun 21, 2013 at 8:34 PM, Asif wrote:
>> > Erick,
>> >
>> > Thanks for your reply.
>> >
>> > You are right about 10 updates being batched up - it was hard to figure out due to the large number of updates/logging that happens in our system.
>> >
>> > We are batching 1000 updates every time.
>> >
>> > Here is my observation from the leader and a replica -
>> >
>> > 1. The leader logs clearly indicate that 1000 updates arrived - [ (1000 adds)],commit=]
>> > 2. On the replica - for each 1000 document adds on the leader - I see a lot of requests, with no indication of how many updates are in each request.
>> >
>> > Digging a little into the Solr code, I found the variable I am interested in - maxBufferedAddsPerServer, which is set to 10 -
>> >
>> > http://svn.apache.org/viewvc/lucene/dev/trunk/solr/core/src/java/org/apache/solr/update/SolrCmdDistributor.java?view=markup
>> >
>> > This means that for a batch update of 1000 documents we will be seeing 100 requests to each replica - which translates into 100 writes per collection per second in our system.
>> >
>> > Should this variable be made configurable via solrconfig.xml (or any other appropriate place)?
>> >
>> > A little background about the system we are trying to build - a real-time analytics solution using SolrCloud + atomic updates - we have a very high volume of writes, going as high as 1000 updates a second (possibly more in the long run).
>> >
>> > - Asif
>> >
>> > On Sat, Jun 22, 2013 at 4:21 AM, Erick Erickson wrote:
>> >> Updates are batched, but it's on a per-request basis. So, if you're sending one document at a time you won't get any batching. If you send 10 docs at a time and they happen to go to 10 different shards, you'll get 10 different update requests.
>> >>
>> >> If you're sending 1,000 docs per update you should be seeing some batching going on.
>> >>
>> >> bq: but why not batch them up or give an option to batch N updates in either of the above cases
>> >>
>> >> I suspect what you're seeing is that you're not sending very many docs per update request and so are being misled.
>> >>
>> >> But that's a guess, since you haven't provided much in the way of data on _how_ you're updating.
>> >>
>> >> bq: the cloud eventually starts to fail
>> >>
>> >> How? Details matter.
>> >>
>> >> Best
>> >> Erick
>> >>
>> >> On Wed, Jun 19, 2013 at 4:23 AM, Asif wrote:
>> >> > Hi,
>> >> >
>> >> > I had questions on the implementation of the Sharding and Replication features of Solr/SolrCloud.
>> >> >
>> >> > 1. I noticed that when sharding is enabled for a collection, individual requests are sent to each node serving as a shard.
>> >> >
>> >> > 2. Replication follows the same strategy of sending individual documents to the nodes serving as replicas.
>> >> >
>> >> > I am working with a system that requires a massive number of writes - I have noticed that, for the above reason, the cloud eventually starts to fail (even though I am using an ensemble).
>> >> >
>> >> > I do understand the reason behind individual updates - but why not batch them up, or give an option to batch N updates, in either of the above cases? I did come across a presentation that talked about batching 10 updates for replication at least, but I do not think this is the case.
>> >> >
>> >> > - Asif
>> >>
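A small, self-contained sketch of the behaviour discussed in this thread: a 1000-document batch against a hardcoded buffer of 10 adds per server, plus one hypothetical way the limit could be exposed as a setting. This is illustrative code only - not the actual SolrCmdDistributor source and not the SOLR-4956 patch - and the system property name is invented for the example.

// Illustrative only - not the actual SolrCmdDistributor code or the SOLR-4956 patch.
public class BufferedAddsExample {

    // The behaviour described in the thread: a fixed buffer of 10 adds per target server.
    static final int HARD_CODED_LIMIT = 10;

    // One hypothetical way to make it tunable, e.g. -Dsolr.maxBufferedAddsPerServer=1000
    // (property name made up for this sketch).
    static final int maxBufferedAddsPerServer =
            Integer.getInteger("solr.maxBufferedAddsPerServer", HARD_CODED_LIMIT);

    public static void main(String[] args) {
        int batchSize = 1000; // documents in one client update request
        int requestsPerReplica =
                (batchSize + maxBufferedAddsPerServer - 1) / maxBufferedAddsPerServer;
        // With the default of 10, a 1000-document batch is forwarded as 100 separate
        // requests per replica - the fan-out observed in the replica logs above.
        System.out.println(requestsPerReplica + " forwarded requests per replica");
    }
}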