lucene-dev mailing list archives

From "Ning Li" <>
Subject Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)
Date Sat, 15 Jul 2006 01:03:31 GMT
> Hey, you're moving the goalposts ;-)
> You proposed a specific patch, and it certainly doesn't have support
> for delete-by-query.

The patch makes IndexWriter support delete-by-term, which is what
IndexReader supports. Granted, delete-by-term is not as general as
delete-by-query, so you don't have to use a searcher to identify the
docs to delete. However, it still requires an IndexReader to identify
the docs to delete, so it is not delete-by-id, and it still requires
more than the public APIs to achieve the goal. So, I think I've always
had the same goal, and delete-by-query is a bonus. :-)

> If one is going to be able to support deleteByQuery, why not a full
> IndexSearcher/IndexWriter combination?

What do you mean by "a full combination"?

> As far as implementation, right now NewIndexModifier overrides and
> reimplements much of the guts of IndexWriter.  Is there a way of
> lowering that profile by providing some extension points, or places to
> hook into IndexWriter events (like before the ram segment is going to
> be flushed)?

Sounds promising. However, some surgery to IndexWriter is still
required. For example, because buffered delete-by-terms or
delete-by-queries are handled differently for ram segments vs. for
disk segments, what you said will be easier to achieve if the
distinction between ram segments and disk segments is more explicit,
as in the patch. In the current IndexWriter, a flush triggered in
maybeMergeSegments() may merge together not only ram segments, but
also some disk segments.
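
To illustrate why the ram/disk distinction matters, here is a rough,
self-contained sketch. It is not code from the patch or from Lucene:
the class BufferedDeleteSketch and all of its members are hypothetical,
and it assumes (for illustration only) that a buffered delete-by-term
should hit every matching doc already on disk, but only those ram docs
that were buffered before the delete was issued.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model; none of these names come from Lucene or the patch.
class BufferedDeleteSketch {
    // A "document" here is just an id plus a single term.
    static final class Doc {
        final int id;
        final String term;
        boolean deleted;
        Doc(int id, String term) { this.id = id; this.term = term; }
    }

    private final List<Doc> diskDocs = new ArrayList<>();
    private final List<Doc> ramDocs = new ArrayList<>();
    // term -> number of ram docs that were buffered when the delete arrived
    private final Map<String, Integer> bufferedDeletes = new HashMap<>();
    private int nextId = 0;

    void addDocument(String term) {
        ramDocs.add(new Doc(nextId++, term));
    }

    // Buffer the delete instead of applying it right away.
    void deleteDocuments(String term) {
        bufferedDeletes.put(term, ramDocs.size());
    }

    // Assumed "before flush" step: apply buffered deletes, then move ram docs to disk.
    void flush() {
        for (Map.Entry<String, Integer> e : bufferedDeletes.entrySet()) {
            String term = e.getKey();
            int limit = e.getValue();
            // Disk docs: every match is deleted.
            for (Doc d : diskDocs) {
                if (d.term.equals(term)) d.deleted = true;
            }
            // Ram docs: only those buffered before the delete are deleted.
            for (int i = 0; i < limit; i++) {
                Doc d = ramDocs.get(i);
                if (d.term.equals(term)) d.deleted = true;
            }
        }
        bufferedDeletes.clear();
        diskDocs.addAll(ramDocs);
        ramDocs.clear();
    }

    // Ids of flushed docs that have not been deleted.
    List<Integer> liveDocIds() {
        List<Integer> live = new ArrayList<>();
        for (Doc d : diskDocs) {
            if (!d.deleted) live.add(d.id);
        }
        return live;
    }
}
```

In this toy model the flush has to know exactly which docs are ram docs
and which are disk docs. If a flush could also pull in disk segments,
the per-delete limit on ram docs would no longer be enough to decide
which docs a buffered delete applies to.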

> Maybe IndexWriter could call a specific method on a
> callback interface with List<Reader> that returns a list of document
> ids to delete (through an efficient interface such as HitCollector or
> Matcher of course).

With what List<Reader>? Could you elaborate?

