lucene-dev mailing list archives

From "Miguel B. (JIRA)" <j...@apache.org>
Subject [jira] Commented: (SOLR-1949) overwrite document fails if Solr index is not optimized
Date Wed, 16 Jun 2010 17:08:22 GMT

    [ https://issues.apache.org/jira/browse/SOLR-1949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12879385#action_12879385 ]

Miguel B. commented on SOLR-1949:
---------------------------------

We have been running a lot of tests and we can no longer reproduce the error. At this point I
can say that the issue does not exist, and we are not using expungeDeletes=true. We don't know
what really happened; it may have been resolved because we removed the whole data directory
before running the new tests.
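
For reference, a minimal sketch of how an expungeDeletes commit could be sent through SolrJ by
posting the raw update XML (the core URL is illustrative, not our real one):

    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.request.DirectXmlRequest;

    public class ExpungeDeletesSketch {
        public static void main(String[] args) throws Exception {
            // Illustrative core URL; replace with the real core.
            SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr/core0");
            // Solr 1.4 accepts expungeDeletes as an attribute of the <commit/> update command,
            // so one simple way to trigger it from SolrJ is to post the raw XML.
            server.request(new DirectXmlRequest("/update", "<commit expungeDeletes=\"true\"/>"));
        }
    }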


We will continue testing.

My apologies.



> overwrite document fails if Solr index is not optimized
> -------------------------------------------------------
>
>                 Key: SOLR-1949
>                 URL: https://issues.apache.org/jira/browse/SOLR-1949
>             Project: Solr
>          Issue Type: Bug
>          Components: update
>    Affects Versions: 1.4
>         Environment: linux centos
>            Reporter: Miguel B.
>
> Scenario:
> - Solr 1.4 with multicore
> - We have a set of 5,000 source documents that we want to index.
> - We send this set to Solr through the SolrJ API and the documents are added correctly. We have a string field ID as the uniqueKey, so the update operation overwrites documents that share the same ID. The result is 4,500 unique documents in Solr. All documents also have an indexed field that contains the source repository of each document; we need it because we want to index other sources as well (a rough SolrJ sketch of this step follows the list).
> - After the add operation, we send an optimize.
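> A minimal sketch of the indexing step, assuming a core at http://localhost:8983/solr/core0 and fields named id, source and text (all illustrative, not the real schema):

    import java.util.ArrayList;
    import java.util.List;

    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class IndexingSketch {
        public static void main(String[] args) throws Exception {
            SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr/core0");

            // Build the batch of source documents; "id" is the uniqueKey, so documents
            // sharing an id overwrite each other and only the last one survives.
            List<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
            for (int i = 0; i < 5000; i++) {
                SolrInputDocument doc = new SolrInputDocument();
                doc.addField("id", "doc-" + (i % 4500));   // 5,000 adds collapse to 4,500 unique ids
                doc.addField("source", "repositoryA");     // source repository of the document
                doc.addField("text", "content of document " + i);
                docs.add(doc);
            }

            server.add(docs);
            server.commit();
            server.optimize();   // optimize after the add, as described above
        }
    }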
>  
> Everything works fine at this point. The Solr core has 4,500 documents (and 4,500 max documents too).
>  
> Now these 5,000 source documents are updated by users, and a set of them are deleted (suppose 1,000). So now we want to update our Solr index with these changes (unfortunately our repository doesn't support an incremental approach). The operations are:
>  
>  - In the Solr index, delete the documents by query (on the field that contains the document source repository). We use the SolrJ deleteByQuery and commit operations.
>  - At this point the Solr core has 0 documents (but still 4,500 max documents, important!!!)
>  - Now we add the new versions of the source documents to Solr (4,000). Remember that the documents do not all have unique identifiers; suppose there are 3,000 unique items. So when the add operation finishes (after the commit is sent) the Solr index should have 3,000 unique items (see the sketch after this list).
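> A minimal sketch of this update cycle, reusing the illustrative core, field names and document builder from the first sketch:

    import java.util.List;

    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class ReindexSketch {
        // newDocs: the 4,000 new document versions (3,000 unique ids), built as in the first sketch.
        static void reindex(SolrServer server, List<SolrInputDocument> newDocs) throws Exception {
            // Drop everything that came from this source repository.
            server.deleteByQuery("source:repositoryA");
            server.commit();
            // The core now reports 0 documents, but maxDoc stays at 4,500 until segments merge.

            // Re-add the new versions; after this commit we expect 3,000 documents,
            // yet the observed count is sometimes lower.
            server.add(newDocs);
            server.commit();
        }
    }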
>  
> But the result isn't 3,000 unique items; we get random results: 3,000, 2,980, 2,976, etc. It's a serious problem because we lose documents.
> We have a workaround: in this sequence of operations, just after the delete operation, we send an optimize to Solr (so maxDocuments is updated). After that, we send the new documents. This way the result is always correct (sketched below).
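> The workaround is the same reindex method as in the previous sketch, with one extra optimize call:

    static void reindexWithWorkaround(SolrServer server, List<SolrInputDocument> newDocs) throws Exception {
        server.deleteByQuery("source:repositoryA");
        server.commit();
        server.optimize();   // the workaround: optimize right after the delete, so maxDoc drops to 0
        server.add(newDocs);
        server.commit();     // with this in place the final count is always the expected 3,000
    }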
> In our tests, we can see that this issue only occurs when the new documents overwrite documents that already existed in Solr.
> Thanks!!

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



