lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] Created: (LUCENE-2679) IndexWriter.deleteDocuments should have option to not apply to docs indexed in the current IW session
Date Fri, 01 Oct 2010 16:23:32 GMT
IndexWriter.deleteDocuments should have option to not apply to docs indexed in the current
IW session
-----------------------------------------------------------------------------------------------------

                 Key: LUCENE-2679
                 URL: https://issues.apache.org/jira/browse/LUCENE-2679
             Project: Lucene - Java
          Issue Type: Improvement
            Reporter: Michael McCandless


In LUCENE-2655 we are struggling with how to handle buffered deletes,
with the new per-thread RAM buffers (DWPT).

But, the only reason why we must maintain a map of del term -> current
docID (or sequence ID) is to correctly handle the interleaved adds &
deletes case.
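
To make the interleaved case concrete, here is a minimal sketch of why a map is needed there. It is not Lucene's actual implementation: `MapBufferedDeletes` and its methods are hypothetical names, and each buffered doc is modeled only as a list of term strings. The map records, for each delete term, how many docs were buffered when the delete arrived, so a later add of the same term survives.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for one in-RAM buffer; not a real Lucene class.
class MapBufferedDeletes {
    // Each buffered doc is modeled only by the terms it contains.
    private final List<List<String>> bufferedDocs = new ArrayList<>();
    // del term -> number of docs buffered when the latest delete of that term arrived
    private final Map<String, Integer> delTermToDocIdLimit = new HashMap<>();

    void addDocument(List<String> terms) {
        bufferedDocs.add(terms);
    }

    void deleteDocuments(String term) {
        // Only docs with docID < limit are affected by this delete.
        delTermToDocIdLimit.put(term, bufferedDocs.size());
    }

    // A buffered doc is deleted only if it was added before a delete of one of its terms.
    boolean isDeleted(int docId) {
        for (String term : bufferedDocs.get(docId)) {
            Integer limit = delTermToDocIdLimit.get(term);
            if (limit != null && docId < limit) {
                return true;
            }
        }
        return false;
    }
}
```

With add("id:1"), delete("id:1"), add("id:1"), only the first doc is deleted; a plain set of delete terms could not tell the two docs apart.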

However, I suspect that for many apps that interleaving never happens.
I.e., most apps delete only docs from *before* the last commit or NRT
reopen.  For such apps, we don't need a Map... we just need a Set of
all del terms to apply to past segments but not to the currently
buffered docs.

And, importantly, with LUCENE-2655, this would be a single Set, not
one per DWPT.  It should be a healthy RAM reduction on buffered
deletes, and should make the deletes call faster (add to one set instead of
N maps).
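
A rough sketch of the Set-based alternative, under the same assumptions as before (hypothetical class and method names, docs modeled as collections of term strings, nothing here taken from Lucene's code): one shared set receives every delete term, and the terms are consulted only for already-flushed segments.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Hypothetical: one instance shared by all indexing threads, instead of one map per DWPT.
class SharedDeleteSet {
    private final Set<String> delTerms = new HashSet<>();

    // A delete is a single set insertion, however many per-thread buffers exist.
    synchronized void deleteDocuments(String term) {
        delTerms.add(term);
    }

    // Docs in past (already flushed) segments are deleted if they contain any buffered term.
    synchronized boolean deletesFlushedDoc(Collection<String> docTerms) {
        for (String term : docTerms) {
            if (delTerms.contains(term)) {
                return true;
            }
        }
        return false;
    }

    // In this mode, docs still buffered in the current session are never affected.
    boolean deletesBufferedDoc(Collection<String> docTerms) {
        return false;
    }

    synchronized int size() {
        return delTerms.size();
    }
}
```

The trade-off is visible in `deletesBufferedDoc`: no per-term docID is stored, so a delete cannot reach docs added earlier in the same session.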

We of course must still support the interleaved case, and I think it
should be the default, but I think we should provide the option for
the common-case apps to take advantage of much less RAM usage.


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



