lucene-dev mailing list archives

From "Shai Erera (JIRA)" <>
Subject [jira] Commented: (LUCENE-1879) Parallel incremental indexing
Date Fri, 26 Mar 2010 19:22:27 GMT


Shai Erera commented on LUCENE-1879:

The way I planned to support multi-threaded indexing is a two-phase addDocument: first,
allocate a doc ID from DocumentsWriter (synchronized), and then add the Document to each slice
with that doc ID. DocumentsWriter was not supposed to know that it is part of a parallel index.
Something like the following:
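int docId = obtainDocId();  // phase 1: synchronized doc ID allocation
for (IndexWriter slice : slices) {
  slice.addDocument(docId, document);  // phase 2: proposed overload, same doc ID in every slice
}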

That allows ParallelWriter to be really an orchestrator/manager of all slices, while each
slice can be an IW on its own.
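
To make the two-phase idea concrete, here is a minimal, self-contained sketch. It does not use
Lucene at all: ParallelWriterSketch, Slice, slice() and the String "documents" are hypothetical
stand-ins invented for illustration, and only obtainDocId and addDocument(docId, ...) mirror the
names in the snippet above. It assumes one field value per slice and that doc IDs start at 0.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only: none of these classes are Lucene APIs.
class ParallelWriterSketch {

    /** Stand-in for one slice (an IndexWriter of its own in the proposal). */
    static final class Slice {
        private final List<String> entries = new ArrayList<>();

        // Synchronized because several indexing threads may write to the same slice.
        synchronized void addDocument(int docId, String field) {
            entries.add(docId + ":" + field);
        }

        synchronized boolean contains(int docId, String field) {
            return entries.contains(docId + ":" + field);
        }

        synchronized int size() {
            return entries.size();
        }
    }

    private final List<Slice> slices = new ArrayList<>();
    private int nextDocId = 0;

    ParallelWriterSketch(int numSlices) {
        for (int i = 0; i < numSlices; i++) {
            slices.add(new Slice());
        }
    }

    // Phase 1: the only step that needs a writer-wide lock.
    private synchronized int obtainDocId() {
        return nextDocId++;
    }

    // Phase 2: hand the same doc ID to every slice, one field value per slice.
    int addDocument(String... fieldPerSlice) {
        if (fieldPerSlice.length != slices.size()) {
            throw new IllegalArgumentException("expected one field value per slice");
        }
        int docId = obtainDocId();
        for (int i = 0; i < slices.size(); i++) {
            slices.get(i).addDocument(docId, fieldPerSlice[i]);
        }
        return docId;
    }

    Slice slice(int index) {
        return slices.get(index);
    }

    public static void main(String[] args) {
        ParallelWriterSketch writer = new ParallelWriterSketch(2);
        int docId = writer.addDocument("some title", "some body");
        System.out.println("added document with doc ID " + docId);
    }
}
```

Note that in this sketch two threads can reach a slice in a different order than they obtained
their doc IDs, so a slice may see doc IDs out of order. The sketch does not try to solve that;
it only shows the split between ID allocation and the per-slice add.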

Now, when you say ParallelDocumentsWriter, I assume you mean that this DocWriter will be aware
of the slices? I think that is an interesting idea, and it is unrelated to LUCENE-2324. I.e.,
ParallelWriter will invoke its addDocument code, which will get down to ParallelDocumentsWriter,
which will allocate the doc ID itself and call each slice's DocWriter.addDocument? And then
LUCENE-2324 will just improve the performance of that process?

This might require a bigger change to IW than I had anticipated, but perhaps it's worth it.

What do you think?

> Parallel incremental indexing
> -----------------------------
>                 Key: LUCENE-1879
>                 URL:
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Index
>            Reporter: Michael Busch
>            Assignee: Michael Busch
>             Fix For: 3.1
>         Attachments: parallel_incremental_indexing.tar
> A new feature that allows building parallel indexes and keeping them in sync on a docID
> level, independent of the choice of the MergePolicy/MergeScheduler.
> Find details on the wiki page for this feature:
> Discussion on java-dev:

