lucene-java-user mailing list archives

From Simon Willnauer <simon.willna...@gmail.com>
Subject Re: delete by docid in lucene 4
Date Thu, 12 Jul 2012 07:42:26 GMT
On Thu, Jul 12, 2012 at 3:09 AM, Sean Bridges <sean.bridges@gmail.com> wrote:
> Is it possible to delete by docId in lucene 4?  I can delete by docid
> in lucene 3 using IndexReader.deleteDocument(int docId), but that
> method is gone in lucene 4, and IndexWriter only allows deleting by
> Term or Query.

That is correct. In Lucene 4, IndexReader is really just a reader!
>
> This is our use case -  In our system, each document is identified by
> a unique serial id.  If an error occurs, we may index the same message
> multiple times.  When an index grows large enough, we stop adding to
> it, and optimize the index.  During optimization, if we see multiple
> docs with the same serialid, we delete all but the first, as all
> documents with the same serialid are the same.

I am wondering why you don't use the IndexWriter#updateDocument(Term, Doc)
method. Do you rely on multiple versions of the same doc? With Lucene
4, relying on the doc ID can become very tricky. If you use multiple
threads, you create a lot of segments, which can be merged in any order,
so you can't tell whether a document ID maintains happened-before
semantics at all.
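
To illustrate the idea, here is a toy in-memory model in plain Java of the delete-then-add behaviour that updateDocument gives you when it is keyed on a unique term such as the serial id. This is only a sketch and not Lucene code: the names UpdateBySerialId, Doc and count are made up for the example, and a real index would use a Term on the serialid field instead of a string comparison.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of an update keyed on a unique id; hypothetical, not Lucene code. */
public class UpdateBySerialId {

    /** A stored document: only a serial id and a body in this sketch. */
    static final class Doc {
        final String serialId;
        final String body;

        Doc(String serialId, String body) {
            this.serialId = serialId;
            this.body = body;
        }
    }

    private final List<Doc> docs = new ArrayList<>();

    /** Plain add: indexing the same message twice leaves two copies. */
    void addDocument(Doc doc) {
        docs.add(doc);
    }

    /** Removes every doc with the given serial id, then adds the new one. */
    void updateDocument(String serialId, Doc doc) {
        docs.removeIf(d -> d.serialId.equals(serialId));
        docs.add(doc);
    }

    /** Number of stored docs carrying the given serial id. */
    int count(String serialId) {
        int n = 0;
        for (Doc d : docs) {
            if (d.serialId.equals(serialId)) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        UpdateBySerialId withAdd = new UpdateBySerialId();
        withAdd.addDocument(new Doc("42", "hello"));
        withAdd.addDocument(new Doc("42", "hello")); // retry after an error
        System.out.println("add twice:    " + withAdd.count("42"));

        UpdateBySerialId withUpdate = new UpdateBySerialId();
        withUpdate.updateDocument("42", new Doc("42", "hello"));
        withUpdate.updateDocument("42", new Doc("42", "hello")); // same retry
        System.out.println("update twice: " + withUpdate.count("42"));
    }
}
```

With the update path, a retried message replaces its earlier copy, so there are no duplicates left to find at optimize time and no need to look at doc IDs.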

Can you tell us more about your use case and why you are deleting by doc ID?

simon


>
> Thanks,
>
> Sean
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
