lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: UpdateHandler batch size / search solr-user
Date Tue, 19 Feb 2019 22:46:57 GMT
Sending batches in parallel is perfectly fine. _However_,
if you’re updating the same document from more than one batch,
there’s no guarantee which update would win.

Imagine you have two processes sending batches. The
order of execution depends on way too many variables.
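
For illustration, a minimal Python sketch of submitting batches in parallel from a single client. The names chunk, send_parallel and fake_send are hypothetical; fake_send stands in for whatever actually posts a batch to the collection's update handler. Results come back in batch order, but the order in which the batches actually run is not guaranteed.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk(docs, size):
    """Split docs into consecutive batches of at most `size` documents."""
    return [docs[i:i + size] for i in range(0, len(docs), size)]


def send_parallel(batches, send, workers=4):
    """Call send(batch) for every batch on a thread pool.

    pool.map returns results in batch order, but the batches themselves
    may execute in any order.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(send, batches))


def fake_send(batch):
    # Assumption: stand-in for an HTTP POST of one batch to the update
    # handler. It only reports how many documents it was given.
    return len(batch)


docs = [{"id": str(n)} for n in range(10)]
batches = chunk(docs, 4)
counts = send_parallel(batches, fake_send)
```

This is only safe in the sense described above: as long as no two batches contain the same document id, the execution order does not matter.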

At a minimum, if process 1 sends a document and some
time later process 2 sends the same document, the one from
process 2 would “win”. The optimistic locking scenario wouldn’t
come into the picture unless you took control of assigning the
_version_ number.
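
To make the difference concrete, here is a toy in-memory model in Python. It is not Solr code; ToyIndex and VersionConflict are hypothetical names. It only mimics the documented _version_ rules: a value greater than 1 must match the stored version exactly, 1 means the document must exist, a negative value means it must not exist, and 0 or no value means no check at all. As far as I know, a real Solr rejects a mismatch with an HTTP 409.

```python
class VersionConflict(Exception):
    """Raised when the supplied _version_ does not match the stored state."""


class ToyIndex:
    """In-memory stand-in for an index; hypothetical, not Solr code."""

    def __init__(self):
        self.docs = {}            # id -> (version, fields)
        self.next_version = 100   # any increasing counter > 1 will do

    def add(self, doc):
        fields = dict(doc)
        expected = fields.pop("_version_", 0)
        current = self.docs.get(fields["id"], (None, None))[0]
        if expected > 1 and current != expected:
            raise VersionConflict("stored version differs")
        if expected == 1 and current is None:
            raise VersionConflict("document does not exist")
        if expected < 0 and current is not None:
            raise VersionConflict("document already exists")
        version = self.next_version
        self.next_version += 1
        self.docs[fields["id"]] = (version, fields)
        return version


index = ToyIndex()
v1 = index.add({"id": "a", "src": "process 1"})   # no _version_: no check
v2 = index.add({"id": "a", "src": "process 2"})   # silently replaces the first

try:
    # Process 1 retries, claiming the document is still at version v1.
    index.add({"id": "a", "src": "process 1 again", "_version_": v1})
    rejected = False
except VersionConflict:
    rejected = True

winner = index.docs["a"][1]["src"]
```

Without a _version_ the later write simply replaces the earlier one; with a stale _version_ the write is refused and the document from process 2 stays in place.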

Best,
Erick

> On Feb 19, 2019, at 9:23 AM, David '-1' Schmid <gdkags@gmail.com> wrote:
> 
> Hi!
> 
> On 2019-02-18T20:36:35, Erick Erickson wrote:
>> Typically, people set their autocommit (hard) settings in
>> solrconfig.xml and forget about it. I usually use a time-based trigger
>> and don’t use documents as a trigger.
> I added a timed autoCommit and it seems to work out nicely. Thank you!
> 
>> Until you do a hard commit, all the incoming documents are held in the
>> transaction log,
> Ah, yes. Somehow I did not draw the link to transactions.
> I've noticed that Solr is using only one of my four CPUs for applying
> the update. With that in mind, could I submit my batches in parallel,
> or would that be worse? To be honest, I've never seen what kind of
> transaction or coherency model is used in Solr.
> 
> I think it's touched briefly by the solr-ref-guide for applying updates
> to single document fields; but I can't say for sure if it's using an
> optimistic strategy or if the parallel updates would produce more
> overhead by pessimistic locking.
> 
> regards,
> =1