lucene-java-user mailing list archives

From Michael McCandless
Subject Re: Document-Ids and Merges
Date Tue, 27 Mar 2012 16:15:24 GMT
In general how Lucene assigns docIDs is a volatile implementation
detail: it's free to change from release to release.

E.g., the default merge policy (TieredMergePolicy) merges out-of-order
segments.  Another example: at one point, IndexSearcher re-ordered the
segments on init.  Another: because ConcurrentMergeScheduler runs
different merges in different threads, they can finish in different
orders and thus alter how subsequent merges are selected.
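
To make the effect concrete, here is a toy model in plain Java. It is
not Lucene's merge code, only a sketch under simplifying assumptions: a
"segment" is a list of app-level IDs, a deleted document is a null entry,
and a "merge" concatenates the live entries in whatever segment order it
is given. The class and method names (ToyMerge, merge) are made up for
illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model only: shows why positions (docIDs) shift after a merge. */
final class ToyMerge {
    /** Concatenate segments in the given order, dropping deleted (null) entries. */
    static List<String> merge(List<List<String>> segments) {
        List<String> merged = new ArrayList<>();
        for (List<String> segment : segments) {
            for (String doc : segment) {
                if (doc != null) {
                    merged.add(doc);
                }
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<String> seg0 = Arrays.asList("a", null, "c"); // one doc deleted
        List<String> seg1 = Arrays.asList("d", "e");
        // Before any merge, "d" sits at global position 3 (a=0, deleted=1, c=2, d=3).

        List<String> inOrder = merge(Arrays.asList(seg0, seg1));
        List<String> outOfOrder = merge(Arrays.asList(seg1, seg0));

        System.out.println(inOrder.indexOf("d"));    // prints 2
        System.out.println(outOfOrder.indexOf("d")); // prints 0
    }
}
```

The same document ends up at a different position depending on which
segments are merged and in what order, so an array indexed by the old
position no longer lines up.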

Really it's best if you assign your own (app-level) ID field and use
that, if you need a stable ID.
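
A minimal sketch of that idea, in plain Java with no Lucene calls: keep
the scores keyed by your own ID, and rebuild the docID-indexed array
whenever a new reader is opened. Everything here is hypothetical
(AppIdScoreStore, buildDocIdArray, the appIdByDocId array); in a real
application the docID-to-app-ID mapping would have to be read from the
index for the current reader, which this sketch leaves out.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch: scores keyed by an app-level ID instead of Lucene's docID. */
final class AppIdScoreStore {
    // appId -> custom score; never refers to docIDs, so merges cannot invalidate it
    private final Map<String, Float> scoreByAppId = new HashMap<>();

    void setScore(String appId, float score) {
        scoreByAppId.put(appId, score);
    }

    float scoreFor(String appId, float defaultScore) {
        return scoreByAppId.getOrDefault(appId, defaultScore);
    }

    /**
     * Rebuild the docID-indexed array for the current reader.
     * appIdByDocId[docId] is assumed to hold the app-level ID of that
     * document, or null if the slot is a deleted document.
     */
    float[] buildDocIdArray(String[] appIdByDocId, float defaultScore) {
        float[] scores = new float[appIdByDocId.length];
        for (int docId = 0; docId < appIdByDocId.length; docId++) {
            String appId = appIdByDocId[docId];
            scores[docId] = (appId == null) ? defaultScore : scoreFor(appId, defaultScore);
        }
        return scores;
    }
}
```

Score updates go through setScore and stay cheap; only the array rebuild
depends on the current docID layout, and it is redone per reader rather
than patched from old docIDs.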

Mike McCandless

On Tue, Mar 27, 2012 at 3:29 AM, Christoph Kaser wrote:
> Hi all,
> I have a search application with 16 million documents that uses custom
> scores per document using a ValueSource. These values are updated a lot (and
> sometimes all at once), so I can't really write them into the index for
> performance reasons. Instead, I simply have a huge array of float values in
> memory and use the document ID as index in the array.
> This works great as long as the index is not changed, but as soon as I have
> a few new documents and deletions, index segments are merged (I suppose) and
> the document IDs of existing documents change. Is there any way to be
> informed when document IDs of existing documents change? If so, is there a
> way to calculate the new document ID from the old one, so I can "convert" my
> array to the new document IDs?
> Any help would be greatly appreciated!
> Best regards,
> Christoph