lucene-java-user mailing list archives

From "Uwe Schindler" <>
Subject RE: ParallelReader
Date Mon, 21 Feb 2011 08:53:52 GMT
Hi David,

With current Lucene versions, it is very complicated to keep the indexes
behind a ParallelReader in sync. The problem is how merges occur. For
ParallelReader to work, all internal document ids (the integers) must be
parallel, i.e. identical across the indexes. As the new MergePolicies now
work on the size of documents and may also run concurrently, it is almost
impossible to have all merges done in parallel so that the internal doc ids
stay the same. ParallelReader, as it is, therefore currently only works with
carefully optimized indexes, and it is not really usable for your use case
at the moment.
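
To illustrate the doc id constraint described above, here is a minimal toy
model in Python. It is not Lucene code and does not describe any specific
MergePolicy: the names doc_ids, merge and is_parallel are hypothetical, and
the rule that a merged segment takes the place of the first segment it was
built from is an assumption of this sketch. It only shows how one merge that
happens in a single index can shift the internal ids.

```python
# Toy model, NOT Lucene code. A segment is a list of document keys, an index
# is a list of segments, and a document's internal id is its position when
# the segments are read in order.

def doc_ids(index):
    """Return the document key found at each internal doc id."""
    return [key for segment in index for key in segment]

def merge(index, positions):
    """Merge the segments at `positions` into one new segment.

    Assumption of this sketch: the merged segment replaces the first of the
    merged segments, and the other merged segments are dropped.
    """
    chosen = sorted(positions)
    merged = [key for pos in chosen for key in index[pos]]
    result = []
    for pos, segment in enumerate(index):
        if pos == chosen[0]:
            result.append(merged)
        elif pos not in chosen:
            result.append(segment)
    return result

def is_parallel(a, b):
    """True if both indexes hold the same document at every internal id."""
    return doc_ids(a) == doc_ids(b)

# Two indexes built in the same order: the ids line up.
main_index = [["p0", "p1"], ["p2"], ["p3", "p4"]]
score_index = [["p0", "p1"], ["p2"], ["p3", "p4"]]
print(is_parallel(main_index, score_index))

# Hypothetical merge of segments 0 and 2 in one index only.
main_index = merge(main_index, [0, 2])
print(doc_ids(main_index))
print(is_parallel(main_index, score_index))

# Merging every segment of a freshly built index keeps the original order.
optimized = merge([["p0", "p1"], ["p2"], ["p3", "p4"]], [0, 1, 2])
print(is_parallel(optimized, score_index))
```

In this model, merging all segments of both indexes while they are still
aligned preserves the alignment, whereas a merge that one index performs on
its own can break it.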

There are approaches in Lucene trunk to support updatable fields (so-called
parallel indexing), but this is not yet working. Please search JIRA for the
corresponding issues.

Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen

> -----Original Message-----
> From: David Saile []
> Sent: Monday, February 21, 2011 9:39 AM
> To:
> Subject: ParallelReader
> Hello everybody,
> I was wondering if someone could point me to what I need to be aware of
> when using a ParallelReader.
> My intention is to modify Nutch ( in such a way that, in the Lucene index
> Nutch uses, only documents for changed websites are updated.
> However, due to the existing scoring algorithms, most pages' page scores
> change. After doing some research about updating single fields in an
> index, I found
> java/ParallelIncrementalIndexing
> This brought me to the idea, to create a separate index for the
> Are there maybe any other approaches around that I overlooked? What do I
> need to be aware of when using two parallel indices?
> Thanks for any help!
> David
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:
