lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-4203) Add IndexWriter.tryDeleteDocument, to delete by document id when possible
Date Mon, 09 Jul 2012 20:19:33 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-4203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13409807#comment-13409807 ]

Michael McCandless commented on LUCENE-4203:
--------------------------------------------

bq. What's the advantage of passing the SIPC here? Wouldn't it be cleaner to just take SegmentReader
and steal it from there?

Oh, no advantage ... I agree passing SR would be better (then we can leave SR.getSI() as package
private).  It shouldn't be any hardship either because you should use an NRT reader from IW
anyway to even find the docID to delete.
                
> Add IndexWriter.tryDeleteDocument, to delete by document id when possible
> -------------------------------------------------------------------------
>
>                 Key: LUCENE-4203
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4203
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/index
>            Reporter: Michael McCandless
>         Attachments: LUCENE-4203.patch
>
>
> Spinoff from LUCENE-4069.
> In that use case, where the app needs to first look up a document, then
> call updateDocument, it's wasteful today because the relatively costly
> lookup (by a primary key field, e.g. "id") is done twice.
> But, since you already resolved the PK to docID on the first lookup,
> it would be nice to then delete by that docID and then you can call
> addDocument instead.
> So I worked out a rough start at this, by adding
> IndexWriter.tryDeleteDocument.  It'd be a very expert API: it takes a
> SegmentInfo (referencing the segment that contains the docID), and as
> long as that segment hasn't yet been merged away, it will mark the
> document for deletion and return true (success).  If it has been
> merged away, it returns false and the app must then delete-by-term.  It
> only works if the writer is in NRT mode (i.e. you've opened an NRT
> reader).
> In LUCENE-4069 using tryDeleteDocument gave a ~20% net speedup.
> I think tryDeleteDocument would also be useful when Solr "updates" a
> document by loading all stored fields, changing them, and calling
> updateDocument.
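
The contract described in the issue (try to delete by docID, fall back to delete-by-term if the segment has been merged away) can be sketched with a toy model. This is only an illustration under stated assumptions: ToyWriter, mergeAway, deleteByTerm and deleteResolved are hypothetical names made up for this sketch, segments are identified by plain strings, and none of it is the signature proposed in the patch or Lucene's actual classes.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the try-delete-then-fall-back pattern; NOT Lucene's real classes. */
public class TryDeleteSketch {

    /** Hypothetical stand-in for a writer that tracks its live segments. */
    static class ToyWriter {
        // segment name -> (docID -> primary key) for documents that are still live
        private final Map<String, Map<Integer, String>> liveSegments = new HashMap<>();

        void addDoc(String segment, int docId, String pk) {
            liveSegments.computeIfAbsent(segment, s -> new HashMap<>()).put(docId, pk);
        }

        /** Simulates a merge: the old segment disappears and its docs get new docIDs. */
        void mergeAway(String segment, String mergedName) {
            Map<Integer, String> old = liveSegments.remove(segment);
            if (old == null) return;
            int nextDocId = 0;
            for (String pk : old.values()) addDoc(mergedName, nextDocId++, pk);
        }

        /** Returns true if the segment is still live (doc marked deleted), false otherwise. */
        boolean tryDeleteDocument(String segment, int docId) {
            Map<Integer, String> docs = liveSegments.get(segment);
            if (docs == null) return false; // segment was merged away
            docs.remove(docId);
            return true;
        }

        /** Slow path: find the document again by its primary key and delete it. */
        void deleteByTerm(String pk) {
            for (Map<Integer, String> docs : liveSegments.values()) {
                docs.values().removeIf(pk::equals);
            }
        }

        boolean contains(String pk) {
            for (Map<Integer, String> docs : liveSegments.values()) {
                if (docs.containsValue(pk)) return true;
            }
            return false;
        }
    }

    /** Deletes an already-resolved document; returns true if the fast path was used. */
    static boolean deleteResolved(ToyWriter w, String segment, int docId, String pk) {
        if (w.tryDeleteDocument(segment, docId)) return true;
        w.deleteByTerm(pk); // the docID is stale, so redo the lookup by primary key
        return false;
    }

    public static void main(String[] args) {
        ToyWriter w = new ToyWriter();
        w.addDoc("_0", 0, "a");
        w.addDoc("_0", 1, "b");
        System.out.println(deleteResolved(w, "_0", 1, "b")); // segment still live
        w.mergeAway("_0", "_1");
        System.out.println(deleteResolved(w, "_0", 0, "a")); // segment merged away
    }
}
```

In either case the caller can follow up with a plain add of the new version of the document; the fallback only costs the second primary-key lookup that the fast path avoids.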
