lucene-java-user mailing list archives

From Nigel <nigelspl...@gmail.com>
Subject Re: Will doc ids ever change if nothing is deleted?
Date Fri, 14 May 2010 02:12:35 GMT
Yes, I realize that storing document IDs persistently (for example) is a Bad
Idea. Partly I'm asking just to make sure I understand what's going on.

There is a use case, though.  In some cases between when we do a search and
return some doc ids, and when we use those doc ids to load some documents,
the index could be reopened.  Normally the same IndexReader instance would
be used for both the searching and the loading and so you'd be assured that
the doc ids would be stable.  But sometimes when searches are distributed
across multiple remote indexes (and non-Lucene search systems) some
aggregation needs to occur before results are loaded, and references to the
IndexReaders can't be maintained across that process.  Currently we remember
the index version associated with a search result (i.e. a set of document
ids) so we can verify when loading the documents that the version is the
same, and therefore the IDs are still valid.  I'm wondering if this is
overly restrictive.  For example, if I knew that no documents had been
deleted, and if (per my original question) only deletions would trigger
renumbering, then the doc ids from a search result could be used on an index
with a newer version.
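
A minimal sketch of the version check described above, in Java. The class
name VersionedHits and its method are hypothetical and not part of Lucene.
The version number is assumed to be whatever the reader reported at search
time (for example from IndexReader.getVersion(), if the Lucene release in
use provides it).

```java
import java.util.Arrays;

/** Doc ids from one search, tagged with the index version they came from. */
final class VersionedHits {
    final long indexVersion;  // version reported by the reader at search time
    final int[] docIds;       // Lucene-assigned ids, valid only for that version

    VersionedHits(long indexVersion, int[] docIds) {
        this.indexVersion = indexVersion;
        // Defensive copy so later changes to the caller's array don't leak in.
        this.docIds = Arrays.copyOf(docIds, docIds.length);
    }

    /** True only if the reader used for loading reports the same version. */
    boolean isValidFor(long currentIndexVersion) {
        return indexVersion == currentIndexVersion;
    }
}
```

The loading side would call isValidFor with the current reader's version and
rerun the search when it returns false. This is the strict check; relaxing it
for the no-deletions case is exactly the open question.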

Thanks,
Chris

On Thu, May 13, 2010 at 9:51 PM, Erick Erickson <erickerickson@gmail.com> wrote:

> Why do you care? That is, what do you want to accomplish
> that makes document ID renumbering relevant?
>
> In general, it is unwise to rely on Lucene-assigned document
> IDs. If you need an invariant document ID, assign it yourself.
>
> If this is off base, could you supply your use-case?
>
> Best
> Erick
>
> On Thu, May 13, 2010 at 9:38 PM, Nigel <nigelspleen@gmail.com> wrote:
>
> > The FAQ clearly states that document IDs will not be re-assigned unless
> > something was deleted.
> >
> >
> > http://wiki.apache.org/lucene-java/LuceneFAQ#When_is_it_possible_for_document_IDs_to_change.3F
> >
> > However, a number of other emails and posts I've read mention that
> > renumbering occurs when segments are merged.  Maybe what people mean
> > is simply that when something is deleted, nothing is immediately
> > renumbered, and it's not until merge time that the renumbering happens.
> > (This is my understanding of how it works.)
> >
> > Just so I'm 100% clear, if I never delete anything, will the IDs ever
> > change?
> >
> > Thanks,
> > Chris
> >
>
