lucene-dev mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: [jira] Updated: (LUCENE-843) improve how IndexWriter uses RAM to buffer added documents
Date Fri, 23 Mar 2007 05:29:18 GMT
: > Actually is #2 a hard requirement?
:
: A lot of Lucene users depend on having document number correspond to
: age, I think.  ISTR Hatcher at least recommending techniques that
: require it.

"Correspond to age" may be misleading, as it implies that the actual
docid has meaning ... it's more that the relative order of addition is
preserved regardless of deletions/merging.

A trivial example of using this is getting the N newest documents matching
a search using a HitCollector: it's just a bounded queue that only
remembers the last N things you put in it.
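
To make the bounded-queue idea concrete, here is a rough sketch. It does not
depend on Lucene: the class name NewestNCollector is made up for illustration,
and its collect(int doc, float score) method only mirrors the shape of
HitCollector's callback. It also assumes hits are delivered in increasing
docid order, which is what makes "last N collected" mean "N newest".

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone sketch, not a Lucene class. It imitates the
// collect(int doc, float score) callback of a HitCollector.
class NewestNCollector {
    private final int limit;
    private final ArrayDeque<Integer> newest = new ArrayDeque<Integer>();

    NewestNCollector(int limit) {
        if (limit <= 0) {
            throw new IllegalArgumentException("limit must be positive");
        }
        this.limit = limit;
    }

    // Assumption: called once per matching doc, in increasing docid order.
    void collect(int doc, float score) {
        if (newest.size() == limit) {
            newest.removeFirst(); // forget the oldest remembered hit
        }
        newest.addLast(doc);
    }

    // The remembered docids, oldest first.
    List<Integer> newestDocs() {
        return new ArrayList<Integer>(newest);
    }
}
```

Feeding it the matches 2, 5, 9, 14, 20 with a limit of 3 leaves 9, 14, 20 in
the queue. If docids stopped tracking order of addition, the same queue would
hold an arbitrary 3 matches instead.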

A more complicated example is duplicate unique field detection: iterating
over a TermDocs and knowing that the doc with the highest docId is the
last one added, so the earlier ones can be ignored/deleted.  (As I recall,
Solr takes advantage of this.)
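
A rough sketch of that duplicate check, again without touching Lucene itself:
the int[] stands in for the docids you would get by walking the postings of a
single unique-key term, and DuplicateKeyResolver / staleDocs are hypothetical
names. This is not how Solr does it, just the idea in isolation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone sketch, not a Lucene or Solr class.
class DuplicateKeyResolver {
    // docIds: every doc containing one unique-key term.
    // Returns all docids except the highest, i.e. the superseded copies,
    // relying on "highest docid == most recently added".
    static List<Integer> staleDocs(int[] docIds) {
        List<Integer> stale = new ArrayList<Integer>();
        if (docIds.length == 0) {
            return stale;
        }
        int newest = docIds[0];
        for (int i = 1; i < docIds.length; i++) {
            if (docIds[i] > newest) {
                stale.add(newest); // previous candidate is older after all
                newest = docIds[i];
            } else {
                stale.add(docIds[i]);
            }
        }
        return stale;
    }
}
```

For a key that appears in docs 3, 17 and 42, staleDocs returns 3 and 17,
leaving 42 as the copy to keep. That answer is only right as long as merging
never reorders documents.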



-Hoss


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

