lucene-dev mailing list archives

From "Andrey Kudryavtsev (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (SOLR-11459) AddUpdateCommand#prevVersion is not cleared which may lead to problem for in-place updates of non existed documents
Date Fri, 19 Jan 2018 17:48:00 GMT

    [ https://issues.apache.org/jira/browse/SOLR-11459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16332599#comment-16332599
] 

Andrey Kudryavtsev edited comment on SOLR-11459 at 1/19/18 5:47 PM:
--------------------------------------------------------------------

{quote}The author, somehow tell us that the update should create this doc. It's not clear
whether it's described as an expected behavior. But even it is, can we have clearer assertNotNull()? 
{quote}
No problem. 
{quote}Also, can't we don't -copy-paste- introduce {{indexDocs()}} but rather loop through
docs? 
{quote}
Not sure about this one. I need a batch of updates on the same Solr node to see the problems
that occur when the same instance of {{AddUpdateCommand}} is reused for different updates. So
{{add(Collection<SolrInputDocument> docs)}} is a must-have here. 

 


was (Author: werder):
{quote}

The author, somehow tell us that the update should create this doc. It's not clear whether
it's described as an expected behavior. But even it is, can we have clearer assertNotNull()? 

{quote}

No problem. 

{quote}

Also, can't we don't -copy-paste- introduce {{indexDocs()}} but rather loop through docs? 

{quote}

Not sure about this one. I need a batch of updates on a same Solr node to see problems
when same instance of {{AddUpdateCommand}} is reused for different updates. So it's like
\{{add(Collection<SolrInputDocument> docs)}} is must have here. 

 

> AddUpdateCommand#prevVersion is not cleared which may lead to problem for in-place updates
of non existed documents
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-11459
>                 URL: https://issues.apache.org/jira/browse/SOLR-11459
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: SolrCloud
>    Affects Versions: 7.0
>            Reporter: Andrey Kudryavtsev
>            Assignee: Ishan Chattopadhyaya
>            Priority: Minor
>
> I have a 1_shard / *m*_replicas SolrCloud cluster with Solr 6.6.0 and run batches of
5-10k in-place updates from time to time. 
> Once I noticed that a job "hung": it started and couldn't finish for a while.
> The logs were full of messages like:
> {code} Missing update, on which current in-place update depends on, hasn't arrived. id=__,
looking for version=___, last found version=0 {code}
> {code} 
> Tried to fetch document ___ from the leader, but the leader says document has been deleted.
Deleting the document here and skipping this update: Last found version: 0, was looking for:
___
> {code}
> Further analysis shows that:
> * There are 100-500 updates for non-existent documents among the other updates (something
that I have to deal with).
> * The leader receives a batch of updates and executes them one by one. {{JavabinLoader}},
which is used to process the documents, reuses the same instance of {{AddUpdateCommand}} for every
update and just [clears its state at the end|https://github.com/apache/lucene-solr/blob/e2521b2a8baabdaf43b92192588f51e042d21e97/solr/core/src/java/org/apache/solr/handler/loader/JavabinLoader.java#L99].
The field [AddUpdateCommand#prevVersion|https://github.com/apache/lucene-solr/blob/6396cb759f8c799f381b0730636fa412761030ce/solr/core/src/java/org/apache/solr/update/AddUpdateCommand.java#L76]
is not cleared.   
> * If an update is an in-place update but the specified document does not exist, the update
is processed as a regular atomic update (i.e. a new doc is created), but {{prevVersion}} is
still sent as the {{distrib.inplace.prevversion}} parameter in the sequential calls to every replica in
{{DistributedUpdateProcessor}}. Because {{prevVersion}} wasn't cleared, it may contain the version from
a previously processed update.
> * Each replica checks its own version of the document, which is 0 (because the doc does not exist),
concludes that some updates were missed, and spends 5 seconds in [DistributedUpdateProcessor#waitForDependentUpdates|https://github.com/apache/lucene-solr/blob/e2521b2a8baabdaf43b92192588f51e042d21e97/solr/core/src/java/org/apache/solr/handler/loader/JavabinLoader.java#L99]
waiting for the missed updates (no luck). It also tries to get the "correct" version from the leader (no
luck as well).
> * So each update for a non-existent document costs *m* * 5 sec.
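
To put the last bullet in numbers, here is a rough upper-bound estimate. It is only a sketch: the class and method names are made up for illustration, {{replicas = 3}} is an assumed value of *m*, and it assumes that every affected update carries a stale {{prevVersion}} and that the 5-second waits are fully sequential. Time spent fetching from the leader and normal processing time are ignored.

```java
/** Back-of-the-envelope estimate of the delay described in the list above. */
final class MissedUpdateCostSketch {
    // Seconds each replica waits for the "missing" update, as reported in the issue.
    static final long WAIT_SECONDS_PER_REPLICA = 5;

    /** Upper bound: every affected update waits on every replica, one after another. */
    static long worstCaseSeconds(int affectedUpdates, int replicas) {
        return (long) affectedUpdates * replicas * WAIT_SECONDS_PER_REPLICA;
    }

    public static void main(String[] args) {
        int replicas = 3; // assumed value of m, not taken from the issue
        for (int affected : new int[] {100, 500}) {
            long seconds = worstCaseSeconds(affected, replicas);
            System.out.printf("%d affected updates x %d replicas: %d s (~%d min)%n",
                    affected, replicas, seconds, seconds / 60);
        }
    }
}
```

Under these assumptions a single batch can stall for somewhere between 25 minutes and about two hours, which is consistent with the job appearing to hang.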
> I worked around this with an explicit check of document existence, but it should probably be fixed.
> The obvious first guess is that {{prevVersion}} should be cleared in {{AddUpdateCommand#clear}},
but I have no clue how to test it.
> {code}
> +++ solr/core/src/java/org/apache/solr/update/AddUpdateCommand.java	(revision )
> @@ -78,6 +78,7 @@
>       updateTerm = null;
>       isLastDocInBatch = false;
>       version = 0;
> +     prevVersion = -1;
>     }
> {code}
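
The effect of the one-line change above can be illustrated without Solr. The following is a minimal sketch, not Solr code: {{ReusableCommand}} is a hypothetical stand-in for {{AddUpdateCommand}} that keeps only the fields relevant here, the version numbers are invented, and the default of -1 for {{prevVersion}} is an assumption based on the patch. As in the description, one command instance is reused for the whole batch and cleared at the end of each update.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for AddUpdateCommand; only the fields relevant to this issue. */
final class ReusableCommand {
    String id;
    long version;
    long prevVersion = -1; // assumed default: -1 means "no previous version to depend on"
    private final boolean resetPrevVersion;

    ReusableCommand(boolean resetPrevVersion) {
        this.resetPrevVersion = resetPrevVersion;
    }

    void clear() {
        id = null;
        version = 0;
        if (resetPrevVersion) {
            prevVersion = -1; // the proposed fix
        }
    }
}

final class StalePrevVersionSketch {
    /** Returns the prevVersion that would be forwarded to replicas for each update. */
    static List<Long> forwardedPrevVersions(boolean resetPrevVersion) {
        ReusableCommand cmd = new ReusableCommand(resetPrevVersion);
        List<Long> forwarded = new ArrayList<>();

        // Update 1: in-place update of an existing doc whose current version is 100.
        cmd.id = "existing";
        cmd.prevVersion = 100L;
        cmd.version = 101L;
        forwarded.add(cmd.prevVersion);
        cmd.clear(); // state is cleared at the end of each update

        // Update 2: in-place update of a missing doc. It is handled as a regular
        // update, so prevVersion is never assigned for it.
        cmd.id = "missing";
        cmd.version = 102L;
        forwarded.add(cmd.prevVersion);
        cmd.clear();

        return forwarded;
    }

    public static void main(String[] args) {
        System.out.println("without fix: " + forwardedPrevVersions(false));
        System.out.println("with fix:    " + forwardedPrevVersions(true));
    }
}
```

Without the reset, the second update inherits 100 from the first one, which is the stale value a replica would then wait on. With the reset, it forwards -1. A real regression test would presumably need to send both updates in a single batch to the same node, so that the same command instance is reused.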




