lucene-dev mailing list archives

From "Ning Li" <ning.li...@gmail.com>
Subject Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)
Date Fri, 14 Jul 2006 05:31:15 GMT
> Solr's implementation is here:
> http://svn.apache.org/viewvc/incubator/solr/trunk/src/java/org/apache/solr/update/DirectUpdateHandler2.java?view=markup

I read it and I see which point I didn't make clear. :-)

I have viewed "delete by term" (which is supported by IndexReader and
NewIndexModifier) as a kind of "delete by query", not "delete by id".

If I replace Term in DeleteTerm with Query (or query string), and
re-define applyDeletesSelectively() as follows (this is not the best
possible implementation, but you get the idea):
  protected void applyDeletesSelectively(Vector deleteQueries,
      IndexReader reader) throws IOException {
    Searcher searcher = new IndexSearcher(reader);
    for (int i = 0; i < deleteQueries.size(); i++) {
      DeleteQuery dq = (DeleteQuery) deleteQueries.elementAt(i);
      Hits hits = searcher.search(dq.query);
      for (int j = 0; j < hits.length(); j++) {
        int doc = hits.id(j);
        // Only delete documents within the bound recorded for this query.
        if (doc <= dq.maxSegment) {
          reader.deleteDocument(doc);
        }
      }
    }
  }

If applyDeletes() is also re-defined accordingly, NewIndexModifier
would be able to support general "delete by query", and support it
efficiently. By efficient, I mean that NewIndexModifier can buffer the
queries and apply them in relatively large batches, provided the delay
caused by buffering is acceptable to the application.
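
To make the buffering idea concrete, here is a minimal sketch that uses
no Lucene classes at all. The "index" is just a sorted map from doc id to
text, and a "query" is a java.util.function.Predicate over that text. The
class BufferedDeleter, its methods, and the field maxDocId are
hypothetical names made up for illustration; only the threshold name
maxBufferedDeleteQueries is borrowed from the patch discussion, and its
exact semantics here are an assumption. Each buffered delete remembers
the highest doc id that existed when it was issued, so that documents
added later are not affected when the batch is finally applied.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import java.util.function.Predicate;

/** Toy model of buffered "delete by query". Not Lucene code. */
class BufferedDeleter {

    /** A buffered delete: the query plus the highest doc id it may touch. */
    private static final class PendingDelete {
        final Predicate<String> query;
        final int maxDocId;

        PendingDelete(Predicate<String> query, int maxDocId) {
            this.query = query;
            this.maxDocId = maxDocId;
        }
    }

    // Stand-in for the index: doc id -> document text.
    private final TreeMap<Integer, String> docs = new TreeMap<>();
    private final List<PendingDelete> pending = new ArrayList<>();
    private final int maxBufferedDeleteQueries; // assumed flush threshold
    private int nextDocId = 0;
    private int flushCount = 0;

    BufferedDeleter(int maxBufferedDeleteQueries) {
        this.maxBufferedDeleteQueries = maxBufferedDeleteQueries;
    }

    /** Adds a document and returns its id. */
    int addDocument(String text) {
        docs.put(nextDocId, text);
        return nextDocId++;
    }

    /** Buffers the delete; applies the whole batch once the buffer is full. */
    void deleteByQuery(Predicate<String> query) {
        pending.add(new PendingDelete(query, nextDocId - 1));
        if (pending.size() >= maxBufferedDeleteQueries) {
            flush();
        }
    }

    /** Applies all buffered deletes in one pass, then clears the buffer. */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        for (PendingDelete d : pending) {
            // Restrict the delete to documents that existed when it was issued.
            docs.headMap(d.maxDocId, true).values().removeIf(d.query);
        }
        pending.clear();
        flushCount++;
    }

    int numDocs() { return docs.size(); }

    int flushCount() { return flushCount; }

    boolean contains(int docId) { return docs.containsKey(docId); }
}
```

With a threshold of 2, the first deleteByQuery() call changes nothing
visible; the second one triggers a single flush that applies both
queries together. In a real index, that single pass is where one reader
could be shared across the batch.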

I hope the point is clear now. :-) The patch was designed so that it
can support "delete by term" and even "delete by query" in general,
and I haven't found a way to achieve this functionality using only
the public APIs of IndexReader and IndexWriter.

As you commented in DirectUpdateHandler2, "deleteByQuery causes a
commit to happen (close current index writer, open new index reader)
before it can be processed. If deleteByQuery functionality is needed,
it's best if they can be batched and executed together so they may
share the same index reader." Well, if you use NewIndexModifier, it
can batch the delete queries for the application to a certain extent
(maxBufferedDocs and maxBufferedDeleteQueries...).

Regards,
Ning
