lucene-dev mailing list archives

From "Yonik Seeley (JIRA)" <>
Subject [jira] Commented: (LUCENE-2252) stored field retrieve slow
Date Sun, 07 Feb 2010 01:05:27 GMT


Yonik Seeley commented on LUCENE-2252:

The thing about stored fields is that it's normally not inner-loop stuff.  The index may be
100M documents, but the average application pages through hits a handful at a time.  And when
loading stored fields gets really slow, it tends to be the OS cache misses due to the index
being large.  We should still optimize it if we can of course (some apps do access many fields
at once), but I agree with Robert that a direct in-memory stored field index probably wouldn't
be a good default.

John, do you have a specific use case where this is the bottleneck, or are you just looking
for places to optimize in general?

> stored field retrieve slow
> --------------------------
>                 Key: LUCENE-2252
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Store
>    Affects Versions: 3.0
>            Reporter: John Wang
> IndexReader.document() on a stored field is rather slow. Did a simple multi-threaded
> test and profiled it:
> 40+% time is spent in getting the offset from the index file
> 30+% time is spent in reading the count (i.e. number of fields to load)
> Although I ran it on my laptop, where the disk isn't that great, there still seems to be
> much room for improvement, e.g. load the field index file into memory (for a 5M doc index,
> the extra memory footprint is 20MB, peanuts compared to other stuff being loaded)
> A related note: are there plans to have custom segments as part of flexible indexing?
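
For illustration only, the proposal in the report (hold the per-document offset table in memory
instead of seeking into the index file on every lookup) could be sketched roughly as below. This
is not Lucene code: the class InMemoryOffsetIndex, its methods, and the fixed-width big-endian
layout are all hypothetical. The 20MB figure for 5M documents corresponds to 4 bytes per entry,
which is an assumption inferred from the numbers in the report; with 8-byte entries, as used in
this sketch, the same table would take 40MB.

```java
import java.nio.ByteBuffer;

/**
 * Hypothetical sketch of an in-memory per-document offset table.
 * Assumes the index data is a flat run of 8-byte big-endian offsets,
 * one per document; this layout is an assumption, not Lucene's format.
 */
class InMemoryOffsetIndex {
    private final long[] offsets;

    /** Copies every offset out of the buffer once, up front. */
    InMemoryOffsetIndex(ByteBuffer indexData) {
        // duplicate() leaves the caller's position untouched and reads big-endian.
        ByteBuffer buf = indexData.duplicate();
        int numDocs = buf.remaining() / Long.BYTES;
        offsets = new long[numDocs];
        for (int i = 0; i < numDocs; i++) {
            offsets[i] = buf.getLong();
        }
    }

    /** Array lookup; replaces a seek plus read against the index file. */
    long offsetOf(int docId) {
        return offsets[docId];
    }

    int numDocs() {
        return offsets.length;
    }

    /** Back-of-envelope memory cost of the table, in bytes. */
    static long footprintBytes(long numDocs, int bytesPerEntry) {
        return numDocs * bytesPerEntry;
    }
}
```

Under these assumptions, footprintBytes(5_000_000, 4) gives 20,000,000 bytes, matching the
estimate in the report, and footprintBytes(5_000_000, 8) gives 40,000,000 bytes.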

