lucene-dev mailing list archives

From "Erick Erickson (JIRA)" <>
Subject [jira] [Resolved] (LUCENE-1034) Add new API method to retrieve document field data in a batch
Date Sun, 10 Mar 2013 13:25:13 GMT


Erick Erickson resolved LUCENE-1034.

    Resolution: Won't Fix

SPRING_CLEANING_2013: we can reopen if necessary.
> Add new API method to retrieve document field data in a batch
> -------------------------------------------------------------
>                 Key: LUCENE-1034
>                 URL:
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/search
>    Affects Versions: 2.2
>         Environment: JDK 1.5.X, Linux & FreeBSD
>            Reporter: Michael Klatt
>            Priority: Minor
>         Attachments:,, LUCENE-1034.patch,,
> I've read in many forums about people who need to retrieve document data for a large
> number of search results. In our case, we need to retrieve up to 10,000 results (sometimes
> more) from an index of over 100 million documents (our index is about 65 GB). This can
> sometimes take a couple of minutes.
> In one of my attempts to improve performance, I modified the IndexReader interface to
> provide a method which looks like:
> public Document[] documents(int[] n, FieldSelector fieldSelector);
> Instead of retrieving document data one document at a time, I would request data for many
> document numbers in one call. The idea was to optimize the seeks on disk so that, in the
> FieldsReader, all the seeks in the indexStream would be done first, and then all the seeks
> in the fieldStream would be completed. For a large number of documents, this yielded a 20%
> speed improvement. The improvement was not as large as I was hoping for, but I felt it was
> significant enough to request changes to the IndexReader interface.
> I'm providing patches for the files that I needed to change for our application. These
> patches are against the 2.2 release.
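
The two-pass seek ordering described in the issue can be illustrated with a small, self-contained sketch. This is not the attached patch and it does not use any Lucene classes: the class name BatchFetchSketch, the methods fetchBatch and buildStore, and the length-prefixed record layout are all hypothetical, and an in-memory ByteBuffer stands in for the on-disk streams. It only shows the general shape of the idea, assuming one pointer table (playing the role of the indexStream) and one data area (playing the role of the fieldStream).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Comparator;

/** Illustrative only: an in-memory stand-in for the two streams, not Lucene code. */
final class BatchFetchSketch {

    /**
     * Fetches the records for docIds in two passes.
     * offsets[doc] plays the role of the index stream (doc number -> pointer);
     * data plays the role of the field stream ([int length][UTF-8 bytes] records).
     */
    static String[] fetchBatch(long[] offsets, ByteBuffer data, int[] docIds) {
        // Visit request slots in ascending doc order so pointer lookups only move forward.
        Integer[] slots = new Integer[docIds.length];
        for (int i = 0; i < slots.length; i++) slots[i] = i;
        Arrays.sort(slots, Comparator.comparingInt(s -> docIds[s]));

        // Pass 1: resolve every pointer before touching the field data.
        long[] pointers = new long[docIds.length];
        for (int slot : slots) {
            pointers[slot] = offsets[docIds[slot]];
        }

        // Pass 2: read the field data in ascending pointer order.
        Arrays.sort(slots, Comparator.comparingLong(s -> pointers[s]));
        String[] results = new String[docIds.length];
        for (int slot : slots) {
            int pos = (int) pointers[slot];
            int length = data.getInt(pos);          // absolute read of the length prefix
            byte[] bytes = new byte[length];
            for (int i = 0; i < length; i++) bytes[i] = data.get(pos + 4 + i);
            results[slot] = new String(bytes, StandardCharsets.UTF_8);
        }
        return results; // results[i] corresponds to docIds[i]
    }

    /** Builds the toy store and fills offsetsOut with the start position of each record. */
    static ByteBuffer buildStore(String[] values, long[] offsetsOut) {
        int size = 0;
        byte[][] encoded = new byte[values.length][];
        for (int i = 0; i < values.length; i++) {
            encoded[i] = values[i].getBytes(StandardCharsets.UTF_8);
            size += 4 + encoded[i].length;
        }
        ByteBuffer data = ByteBuffer.allocate(size);
        for (int i = 0; i < values.length; i++) {
            offsetsOut[i] = data.position();
            data.putInt(encoded[i].length).put(encoded[i]);
        }
        return data;
    }

    public static void main(String[] args) {
        String[] docs = {"doc zero", "doc one", "doc two", "doc three"};
        long[] offsets = new long[docs.length];
        ByteBuffer data = buildStore(docs, offsets);
        // Request documents out of order; results come back in request order.
        String[] hits = fetchBatch(offsets, data, new int[] {3, 0, 2});
        System.out.println(Arrays.toString(hits));
    }
}
```

Results are written back into the caller's original request order, so the sorting stays an internal detail of the batch call. Whether ordering the reads this way helps on a real index depends on the storage layer and caching; the 20% figure above is the reporter's measurement, not something this sketch reproduces.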
