lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Lucene's internal doc ID space
Date Sat, 12 May 2012 20:12:31 GMT
On Sat, May 12, 2012 at 9:12 AM, Valeriy Felberg
<valeri.felberg@gmail.com> wrote:
>> the Document IDs in Lucene are per segment. ie. they are always
>> segment based.
>
> @Simon I'm just wondering: If the document IDs are per segment how
> does it work if I call Searcher.search(Query, int) and get TopDocs
> referencing ScoreDocs which contain document IDs? What happens if
> there are two matching documents in different segments? How does
> Lucene know which segment is meant if I call Searcher.doc(docId) with
> some docId from the search result?

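The per-segment docIDs are "rebased" before Searcher.search returns,
i.e. turned into global docIDs against the top-level reader: each
segment's starting offset (its doc base) is added to the segment-local
docID. Searcher.doc(docId) reverses that mapping to find the right
segment.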
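
To make the rebasing concrete, here is a minimal sketch in plain Java.
It does not use Lucene's classes; DocIdRebase, docBases, toGlobal and
segmentOf are hypothetical names chosen for illustration, and the
segment sizes are made up. It only assumes that a segment's doc base is
the total number of docs in the segments before it.

```java
public class DocIdRebase {
    // Hypothetical helper: bases[i] is the number of docs in all
    // segments that come before segment i.
    static int[] docBases(int[] maxDocPerSegment) {
        int[] bases = new int[maxDocPerSegment.length];
        int sum = 0;
        for (int i = 0; i < maxDocPerSegment.length; i++) {
            bases[i] = sum;
            sum += maxDocPerSegment[i];
        }
        return bases;
    }

    // Per-segment docID -> global docID (the "rebase" step).
    static int toGlobal(int[] bases, int segment, int localDoc) {
        return bases[segment] + localDoc;
    }

    // Global docID -> index of the segment that holds it: the last
    // segment whose base is <= the global docID.
    static int segmentOf(int[] bases, int globalDoc) {
        for (int i = bases.length - 1; i >= 0; i--) {
            if (bases[i] <= globalDoc) {
                return i;
            }
        }
        throw new IllegalArgumentException("negative docID: " + globalDoc);
    }

    public static void main(String[] args) {
        int[] bases = docBases(new int[] {5, 3, 4}); // bases = {0, 5, 8}
        int global = toGlobal(bases, 1, 2);          // 5 + 2 = 7
        int segment = segmentOf(bases, global);      // 1
        int local = global - bases[segment];         // 2
        System.out.println(global + " -> segment " + segment + ", local doc " + local);
    }
}
```

So two hits with the same per-segment docID in different segments
still end up with distinct global docIDs, and a lookup by global docID
can recover both the segment and the local docID.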

Also: when a merge runs, it drops any deleted documents, so all
remaining (non-deleted) documents are renumbered...
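
A rough sketch of that renumbering, again in plain Java rather than
Lucene's own code. MergeRenumber and docMaps are hypothetical names,
and the sketch assumes the merged segment simply concatenates the
surviving docs in segment order.

```java
import java.util.Arrays;

public class MergeRenumber {
    // Hypothetical helper: live[s][d] is true if doc d of segment s is
    // not deleted. Returns, for each old (segment, docID), the new
    // docID in the merged segment, or -1 if the doc was deleted.
    static int[][] docMaps(boolean[][] live) {
        int[][] maps = new int[live.length][];
        int next = 0; // next free docID in the merged segment
        for (int s = 0; s < live.length; s++) {
            maps[s] = new int[live[s].length];
            for (int d = 0; d < live[s].length; d++) {
                maps[s][d] = live[s][d] ? next++ : -1;
            }
        }
        return maps;
    }

    public static void main(String[] args) {
        boolean[][] live = {
            {true, false, true}, // segment 0: doc 1 is deleted
            {false, true}        // segment 1: doc 0 is deleted
        };
        // prints [[0, -1, 1], [-1, 2]]
        System.out.println(Arrays.deepToString(docMaps(live)));
    }
}
```

The practical consequence is that a docID taken from one search result
is not a stable identifier across merges.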

Mike McCandless

http://blog.mikemccandless.com
