lucene-solr-dev mailing list archives

From "Jayson Minard (JIRA)" <>
Subject [jira] Commented: (SOLR-1162) SolrJ does not maintain order of operations when using an UpdateRequest object to send them in bulk
Date Wed, 13 May 2009 14:00:45 GMT


Jayson Minard commented on SOLR-1162:

Multiple requests are less efficient than sending large batches together.  

To be the most efficient with large requests, every user of SolrJ UpdateRequest would need
to write the same logic: place adds into the UpdateRequest until you hit the first non-add,
then send the UpdateRequest and start writing your deletes until you hit a non-delete, then
flush the UpdateRequest and keep adding your new transaction type until you hit the first
...  In that case they should avoid using UpdateRequest altogether, as calling the SolrServer
directly is just as "easy."  If we are going to batch on their behalf, why wouldn't we do it
correctly and be predictable with our ordering?  I'm sure if JDBC batches did not maintain
order, there would be hell to pay...
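
For illustration, the flush-on-type-change logic described above might look roughly like the
sketch below. It is plain Java with no SolrJ dependency; OrderedBatcher, Kind, and Op are
hypothetical names invented for this example, and each returned run stands in for one
request that a caller would have to send before starting the next.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of SolrJ: splits an ordered operation stream into same-kind runs. */
class OrderedBatcher {
    /** Operation kinds, named here for illustration only. */
    public enum Kind { ADD, DELETE_BY_ID, DELETE_BY_QUERY }

    /** One queued operation: its kind plus a payload (a document id or a query string). */
    public static final class Op {
        public final Kind kind;
        public final String payload;

        public Op(Kind kind, String payload) {
            this.kind = kind;
            this.payload = payload;
        }
    }

    /** Groups consecutive operations of the same kind, preserving the overall order. */
    public static List<List<Op>> batch(List<Op> ops) {
        List<List<Op>> batches = new ArrayList<>();
        List<Op> current = new ArrayList<>();
        for (Op op : ops) {
            // A change of kind closes the current run; this is where a request would be flushed.
            if (!current.isEmpty() && current.get(0).kind != op.kind) {
                batches.add(current);
                current = new ArrayList<>();
            }
            current.add(op);
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Op> ops = new ArrayList<>();
        ops.add(new Op(Kind.ADD, "1"));
        ops.add(new Op(Kind.ADD, "2"));
        ops.add(new Op(Kind.DELETE_BY_ID, "1"));
        ops.add(new Op(Kind.ADD, "1"));
        // Print one line per run: the kind and how many operations it holds.
        for (List<Op> run : batch(ops)) {
            System.out.println(run.get(0).kind + " x" + run.size());
        }
    }
}
```

Every interleaving of adds and deletes produces another run, and so another round trip,
which is the inefficiency mentioned at the top of this comment.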

Besides that, the order of operations isn't clear to users of UpdateRequest, so someone
doing Add doc 1, Delete doc 1, Add doc 1 may not end up with the expected outcome.   It
turns into Add doc 1, Add doc 1, Delete doc 1 when streaming, and similarly for the XML version
of the transaction.  If I did a Delete Query *:* then Add doc 1, Add doc 2, I end up with no
docs because the delete query comes last, but I (the user) do not know that.
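
The two cases above can be reproduced with a toy model. This is not SolrJ code: a set of ids
stands in for the index, OrderingDemo and its members are hypothetical names, and the
grouped ordering (all adds, then deletes by id, then delete queries) is assumed from the
behavior described in this comment.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy model, not SolrJ code: compares in-order replay with grouped-by-kind replay. */
class OrderingDemo {
    /** Declaration order matters: groupedByType() emits kinds in this order. */
    public enum Kind { ADD, DELETE_BY_ID, DELETE_ALL }

    public static final class Op {
        public final Kind kind;
        public final String id; // ignored for DELETE_ALL, which models a *:* delete query

        public Op(Kind kind, String id) {
            this.kind = kind;
            this.id = id;
        }
    }

    /** Applies operations, in the order given, to a set of ids standing in for the index. */
    public static Set<String> apply(List<Op> ops) {
        Set<String> index = new LinkedHashSet<>();
        for (Op op : ops) {
            switch (op.kind) {
                case ADD:
                    index.add(op.id);
                    break;
                case DELETE_BY_ID:
                    index.remove(op.id);
                    break;
                case DELETE_ALL:
                    index.clear();
                    break;
            }
        }
        return index;
    }

    /** Reorders operations the way separate per-kind lists would: adds, deletes by id, delete queries. */
    public static List<Op> groupedByType(List<Op> ops) {
        List<Op> out = new ArrayList<>();
        for (Kind kind : Kind.values()) {
            for (Op op : ops) {
                if (op.kind == kind) {
                    out.add(op);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Op> ops = new ArrayList<>();
        ops.add(new Op(Kind.DELETE_ALL, null));
        ops.add(new Op(Kind.ADD, "1"));
        ops.add(new Op(Kind.ADD, "2"));
        System.out.println("in order: " + apply(ops));
        System.out.println("grouped:  " + apply(groupedByType(ops)));
    }
}
```

In this model the in-order replay keeps both documents, while the grouped replay leaves the
index empty because the delete query runs last.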

I've written code to work around UpdateRequest ordering and I usually end up only using it
for commitWithin or having a commit tacked on the end of the request due to the above issues.

> SolrJ does not maintain order of operations when using an UpdateRequest object to send them in bulk
> ---------------------------------------------------------------------------------------------------
>                 Key: SOLR-1162
>                 URL:
>             Project: Solr
>          Issue Type: Improvement
>          Components: clients - java
>    Affects Versions: 1.3
>            Reporter: Jayson Minard
>         Attachments: Solr-1162.patch, Solr-1162.patch
> The SolrJ UpdateRequest object maintains separate lists of documents to add, deletes,
> and delete queries, so the order in which those operations run is not known to the caller.
> It really should execute the items in the same order they were added to the UpdateRequest.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
