lucene-java-user mailing list archives

From Ravikumar Govindarajan <ravikumar.govindara...@gmail.com>
Subject Re: App supplied docID in lucene possible?
Date Wed, 07 Nov 2012 10:15:11 GMT
Many thanks, Mike, for helping me understand the problems.

I gather that only postings "may be" able to handle "sparse non-decreasing
docIDs per flush", while other areas such as stored fields and the field
cache currently cannot.
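
To make that invariant concrete, here is a rough sketch in plain Java. It
is an illustration only: FlushDocIdCheck and its methods are hypothetical
names, not part of Lucene, and it assumes the app hands over one docID per
added document. It rejects any out-of-order docID within a flush and counts
the unused docID slots that remain, which is the sparseness problem.

```java
/** Hypothetical helper (not a Lucene class): tracks app-supplied docIDs for one flush. */
final class FlushDocIdCheck {
    private int first = -1;    // first docID seen since the last flush, -1 if none
    private int last = -1;     // most recent docID seen since the last flush
    private int distinct = 0;  // number of distinct docIDs seen since the last flush

    /** Accepts a docID only if it is not smaller than the previous one. */
    void add(int docId) {
        if (docId < 0) {
            throw new IllegalArgumentException("negative docID " + docId);
        }
        if (docId < last) {
            throw new IllegalArgumentException(
                "out-of-order docID " + docId + " after " + last);
        }
        if (distinct == 0) {
            first = docId;
        }
        if (docId != last) {
            distinct++;
        }
        last = docId;
    }

    /** Unused docID slots between the first and last docID of this flush. */
    int gaps() {
        if (distinct == 0) {
            return 0;
        }
        return (last - first + 1) - distinct;
    }

    /** Resets the state, as would happen once a segment is flushed. */
    void flush() {
        first = -1;
        last = -1;
        distinct = 0;
    }
}
```

With docID=1 followed by docID=10000, both adds are accepted, yet gaps()
reports 9998 unused slots. The ordering check alone does not remove the
sparseness.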

Can you recall and point me to any previous discussions of
sort-docID-before-flush or sparse-doc handling?

--
Ravi

On Tue, Nov 6, 2012 at 4:53 PM, Michael McCandless <lucene@mikemccandless.com> wrote:

> On Tue, Nov 6, 2012 at 1:04 AM, Ravikumar Govindarajan
> <ravikumar.govindarajan@gmail.com> wrote:
> > Looks far more complex than I had assumed!!!
>
> I think it would be a major undertaking.
>
> > An invariant of "non-decreasing docID per flush", if pushed to the app,
> > can save Lucene from handling the complex sparse-data logic, no?
> >
> > Lucene can keep its existing logic without major changes, detect any
> > out-of-order doc before every flush and emit an error.
>
> We could do that, though the sparseness problem remains, e.g. if the app
> adds docID=1 and then docID=10000 ...
>
> > I understand that multi-threaded indexing and such concerns also need to
> > be handled by the app, but that's what apps get when trying to control docIDs
>
> In 4.0 (an improvement over 3.6) each thread writes to its own private
> segment ... so as long as one thread always indexed docs in monotonic
> order that could in theory work ...
>
> Mike McCandless
>
> http://blog.mikemccandless.com
