lucene-solr-user mailing list archives

From Arnold Bronley <arnoldbron...@gmail.com>
Subject Re: Not able reproduce race condition issue to justify implementation of optimistic concurrency
Date Fri, 16 Nov 2018 22:16:49 GMT
Thanks for replying, Chris.

1) depending on the number of CPUs / load on your solr server, it's
possible you're just getting lucky. it's hard to "prove" with a
multithreaded test that concurrency bugs exist.

- Agreed. However, across 200k total calls the race condition did not occur
even once, which feels 'too' lucky.

2)  a lot depends on what your updates look like (ie: the impl of
SolrDocWriter.atomicWrite()), and what the field definitions look like.

If you are in fact doing "atomic updates" (ie: sending a "set" command on
the field) instead of sending the whole document *AND* if the fields f1 &
f2 are fields that only use docValues (ie: not stored or indexed) then
under the covers you're getting an "in-place" update in which (IIRC) it's
totally safe for the 2 updates to happen concurrently to *DIFFERENT*
fields of the same document.

- The atomicWrite() function is just a simple wrapper that adds "set" and
other appropriate atomic operators to the payload before indexing it.
- I am not using docValues for these fields. Here are their definitions:
   <field name="average_rating" type="tfloat" stored="true" indexed="true" />
   <field name="fresh_score" type="tfloat" stored="true" indexed="true" default="0.0" />
  So I don't think I am benefiting from in-place updates.
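
As a rough check of that conclusion, here is a minimal sketch (Python, standard library only) that applies the in-place update criteria as I understand them from the Solr Reference Guide: the field must be single-valued with docValues="true", indexed="false" and stored="false". The <fields> wrapper element, the helper names, and the assumed defaults for absent attributes are mine, for illustration only. In a real schema, absent attributes inherit from the fieldType, which this sketch does not read.

```python
import xml.etree.ElementTree as ET

# The two definitions from this message, wrapped in a hypothetical
# <fields> element so they parse as one XML document.
FIELDS_XML = """
<fields>
  <field name="average_rating" type="tfloat" stored="true" indexed="true" />
  <field name="fresh_score" type="tfloat" stored="true" indexed="true" default="0.0" />
</fields>
"""


def flag(field, attr, assumed_default):
    """Read a boolean attribute; fall back to an ASSUMED default if absent.

    Real defaults come from the fieldType, which is not consulted here.
    """
    value = field.get(attr)
    if value is None:
        return assumed_default
    return value.strip().lower() == "true"


def in_place_eligible(field):
    """True only for a non-indexed, non-stored, single-valued docValues field."""
    return (
        not flag(field, "indexed", True)
        and not flag(field, "stored", True)
        and not flag(field, "multiValued", False)
        and flag(field, "docValues", False)
    )


def check(xml_text):
    """Map each field name to whether it looks eligible for in-place updates."""
    root = ET.fromstring(xml_text.strip())
    return {f.get("name"): in_place_eligible(f) for f in root.iter("field")}


if __name__ == "__main__":
    for name, ok in check(FIELDS_XML).items():
        print(name, "in-place eligible" if ok else "regular atomic update")
```

Because the assumed defaults are conservative, a field with missing attributes is reported as not eligible rather than guessed to be eligible.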

- I will try the scenario of two threads updating a single field of the
same document, instead of two threads writing two different fields of the
same document.
- I was actually worried about performance because I do batch indexing: if
any single document in a batch fails with a 409 response, I will need to
send the whole batch again, or else somehow retry only the document that
failed. However, identifying which document failed seems possible only by
parsing the 409 response message string, which doesn't seem like a good way
of doing it.
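
For the same-field scenario, the loop of read, modify, write with the expected version, and retry on conflict can be sketched against an in-memory stand-in for Solr. Nothing here talks to Solr: VersionedStore, VersionConflict and increment_with_retry are hypothetical names, and the version check only mimics what passing _version_ with an update is meant to do. Sending one document per request like this avoids parsing the 409 message, at the cost of more round trips than a batch.

```python
import threading


class VersionConflict(Exception):
    """Stand-in for an HTTP 409 version conflict response."""


class VersionedStore:
    """In-memory stand-in for a Solr core holding a single document."""

    def __init__(self):
        self._lock = threading.Lock()
        self._doc = {"counter": 0}
        self._version = 1

    def read(self):
        """Return a copy of the document together with its current version."""
        with self._lock:
            return dict(self._doc), self._version

    def write(self, field, value, expected_version):
        """Apply the write only if the caller saw the latest version."""
        with self._lock:
            if expected_version != self._version:
                raise VersionConflict(
                    f"expected={expected_version} actual={self._version}"
                )
            self._doc[field] = value
            self._version += 1


def increment_with_retry(store, field, times):
    """Increment `field` `times` times, re-reading and retrying on conflict."""
    conflicts = 0
    for _ in range(times):
        while True:
            doc, version = store.read()
            try:
                store.write(field, doc[field] + 1, version)
                break
            except VersionConflict:
                conflicts += 1  # another thread won the race; read again
    return conflicts


def run(threads=2, times=1000):
    """Run concurrent incrementers on one field and return the final value."""
    store = VersionedStore()
    workers = [
        threading.Thread(target=increment_with_retry, args=(store, "counter", times))
        for _ in range(threads)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    doc, _ = store.read()
    return doc["counter"]


if __name__ == "__main__":
    print(run(threads=2, times=1000))
```

With the version check in place the final value always equals threads * times, however many conflicts occur along the way. Dropping the check from write() is what allows increments to be lost.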




On Fri, Nov 16, 2018 at 1:10 PM Chris Hostetter <hossman_lucene@fucit.org>
wrote:

>
> 1) depending on the number of CPUs / load on your solr server, it's
> possible you're just getting lucky. it's hard to "prove" with a
> multithreaded test that concurrency bugs exist.
>
> 2) a lot depends on what your updates look like (ie: the impl of
> SolrDocWriter.atomicWrite()), and what the field definitions look like.
>
> If you are in fact doing "atomic updates" (ie: sending a "set" command on
> the field) instead of sending the whole document *AND* if the fields f1 &
> f2 are fields that only use docValues (ie: not stored or indexed) then
> under the covers you're getting an "in-place" update in which (IIRC) it's
> totally safe for the 2 updates to happen concurrently to *DIFFERENT*
> fields of the same document.
>
> Where you are almost certainly going to get into trouble, even if you are
> leveraging "in-place" updates under the hood, is if 2 diff threads try to
> update the *SAME* field -- even if the individual threads don't try to
> assert that the final count matches their expected count, you will likely
> wind up missing some updates (ie: the final value may not equal the sum
> of the total increments from both threads)
>
> Other problems will exist in cases where in-place updates can't be used
> (ie: if you also updated a String field when incrementing your numeric
> counter)
>
> The key thing to remember is that there is almost no overhead in using
> optimistic concurrency -- *UNLESS* you encounter a collision/failure.  If
> you are planning on having concurrent indexing clients reading docs from
> solr, modifying them, and writing back to solr -- and there is a chance
> multiple client threads will touch the same document, then the slight
> addition of optimistic concurrency params to the updates & retrying on
> failure is a trivial addition to the client code, and shouldn't have a
> noticeable impact on performance.
>
>
>
> : Before implementing an optimistic concurrency solution, I had written one
> : test case to check whether two threads atomically writing two different
> : fields (say f1 and f2) of the same document (say d) run into conflict or not.
> : Thread t1 atomically writes counter c1 to field f1 of document d, commits
> : and then reads the value of f1 and makes sure that it is equal to c1. It
> : then increments c1 by 1 and repeats until c1 reaches, say, 1000.
> : Thread t2 does the same with counter c2 and field f2, on the same
> : document d.
> : What I observed is the assertion of f1 = c1 or f2 = c2 in each loop never
> : fails.
> : I even increased the max counter value from 1000 to 100000 and still
> : saw no conflict.
> : I was under the impression that there would often be conflicts and that
> : this is why I would need an optimistic concurrency solution. How is this
> : possible? Any idea?
> :
> : Here is the test case code:
> :
> : https://pastebin.com/KCLPYqeg
> :
>
> -Hoss
> http://www.lucidworks.com/
>
