lucene-dev mailing list archives

From "Adrien Grand (JIRA)" <>
Subject [jira] [Commented] (LUCENE-4539) DocValues impls should read all headers up-front instead of per-directsource
Date Tue, 06 Nov 2012 00:34:14 GMT


Adrien Grand commented on LUCENE-4539:

Interesting, I had started to work on the _exact_ same API for LUCENE-4536 (because I needed
to read the header generated by getWriter to know how long a packed array would be depending
on the version) but I finally decided to fix it another way.

bq. not sure if this would be useful outside of this particular issue

Maybe not but I agree that it would help make the code cleaner, so +1!

Related discussion: I think the header format is fine when packed ints are in their own file,
but when packed ints are nested in another file, they should probably not declare a codec
header: the PackedInts codec name check is redundant with the main codec name check. (I was
actually thinking of deprecating get{Reader,DirectReader,ReaderIterator} to force callers
to think about how they should store the valueCount and bitsPerValue, and to discourage
them from using a standard header when packed ints are not in their own file, which would
avoid performing redundant checks.)
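
To illustrate the nested case, here is a minimal sketch in plain Java (it does not use the Lucene API). It assumes a made-up container layout: one outer header that is checked once, followed by blocks that store only their own valueCount and bitsPerValue ahead of the packed payload. The names NestedPackedSketch, OUTER_MAGIC, packedByteCount and payloadOffset are hypothetical, and the payload is left as zeros since only the layout matters here.

```java
import java.nio.ByteBuffer;

final class NestedPackedSketch {
    // Illustrative value only, not a Lucene constant.
    static final int OUTER_MAGIC = 0x444F4356;

    // Bytes needed to hold valueCount values of bitsPerValue bits each.
    static int packedByteCount(int valueCount, int bitsPerValue) {
        return (int) (((long) valueCount * bitsPerValue + 7) / 8);
    }

    // Layout: [magic:int][blockCount:int] then per block
    // [valueCount:int][bitsPerValue:byte][payload bytes]. No per-block codec header.
    static byte[] write(int[] valueCounts, int[] bitsPerValue) {
        int size = 8;
        for (int i = 0; i < valueCounts.length; i++) {
            size += 5 + packedByteCount(valueCounts[i], bitsPerValue[i]);
        }
        ByteBuffer out = ByteBuffer.allocate(size);
        out.putInt(OUTER_MAGIC).putInt(valueCounts.length);
        for (int i = 0; i < valueCounts.length; i++) {
            out.putInt(valueCounts[i]).put((byte) bitsPerValue[i]);
            // Payload is left as zeros in this sketch; just skip over it.
            out.position(out.position() + packedByteCount(valueCounts[i], bitsPerValue[i]));
        }
        return out.array();
    }

    // Returns the byte offset of the payload of block `target`.
    // The outer magic is the only identity check that is performed.
    static int payloadOffset(byte[] file, int target) {
        ByteBuffer in = ByteBuffer.wrap(file);
        if (in.getInt() != OUTER_MAGIC) {
            throw new IllegalStateException("bad outer header");
        }
        int blockCount = in.getInt();
        if (target < 0 || target >= blockCount) {
            throw new IndexOutOfBoundsException("block " + target);
        }
        for (int i = 0; ; i++) {
            int valueCount = in.getInt();
            int bits = in.get();
            if (i == target) {
                return in.position();
            }
            // The stored metadata is enough to know how long the packed array is.
            in.position(in.position() + packedByteCount(valueCount, bits));
        }
    }
}
```

In this layout each nested block costs 5 bytes of metadata, and the length of a packed array follows directly from the two stored fields rather than from a per-block header.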

> DocValues impls should read all headers up-front instead of per-directsource
> ----------------------------------------------------------------------------
>                 Key: LUCENE-4539
>                 URL:
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/index
>            Reporter: Robert Muir
>         Attachments: LUCENE-4539.patch
> Currently, when DocValues opens, it just opens files; it doesn't read codec headers etc.
> Instead we read these every single time a directsource opens.
> I think it should work like PostingsReaders: e.g. the PackedInts impl would read its
> versioning info and codec headers up-front, and creating a new Direct impl should be an
> IndexInput.clone() + getDirectReaderNoHeader().
> Today it's much more costly.
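
The pattern described in the issue can be sketched in plain Java, with java.nio.ByteBuffer standing in for IndexInput: the header is validated once when the file is opened, and each direct source is then just a cheap duplicate positioned past the header. This is only an illustration under assumed names and an assumed file layout (UpFrontHeaderSketch, MAGIC, VERSION, DirectSource and the bit order are all hypothetical), not the code from the attached patch.

```java
import java.nio.ByteBuffer;

final class UpFrontHeaderSketch {
    static final int MAGIC = 0x5041434B; // illustrative value, not a Lucene constant
    static final int VERSION = 1;

    private final ByteBuffer file;
    private final int bitsPerValue;
    private final int valueCount;
    private final int dataStart;
    int headerReads = 0; // counts how often the header was parsed

    // Open: read and validate the header exactly once.
    UpFrontHeaderSketch(ByteBuffer file) {
        ByteBuffer in = file.duplicate();
        if (in.getInt() != MAGIC) throw new IllegalStateException("bad magic");
        if (in.getInt() != VERSION) throw new IllegalStateException("unsupported version");
        this.bitsPerValue = in.get();
        this.valueCount = in.getInt();
        this.dataStart = in.position();
        this.file = file;
        headerReads++;
    }

    // Analogue of clone() plus a header-less reader: no header parsing here.
    DirectSource newDirectSource() {
        ByteBuffer clone = file.duplicate();
        clone.position(dataStart);
        return new DirectSource(clone.slice(), bitsPerValue, valueCount);
    }

    static final class DirectSource {
        private final ByteBuffer data;
        private final int bitsPerValue;
        private final int valueCount;

        DirectSource(ByteBuffer data, int bitsPerValue, int valueCount) {
            this.data = data;
            this.bitsPerValue = bitsPerValue;
            this.valueCount = valueCount;
        }

        // Reads value `index`, most significant bit first.
        long get(int index) {
            if (index < 0 || index >= valueCount) throw new IndexOutOfBoundsException("index " + index);
            long bitPos = (long) index * bitsPerValue;
            long value = 0;
            for (int b = 0; b < bitsPerValue; b++, bitPos++) {
                int byteVal = data.get((int) (bitPos >>> 3)) & 0xFF;
                value = (value << 1) | ((byteVal >>> (7 - (int) (bitPos & 7))) & 1);
            }
            return value;
        }
    }

    // Test helper: builds [magic][version][bitsPerValue][valueCount][packed bits].
    static ByteBuffer write(int bitsPerValue, long[] values) {
        byte[] data = new byte[(int) (((long) values.length * bitsPerValue + 7) / 8)];
        long bitPos = 0;
        for (long v : values) {
            for (int b = bitsPerValue - 1; b >= 0; b--, bitPos++) {
                if (((v >>> b) & 1) != 0) data[(int) (bitPos >>> 3)] |= 1 << (7 - (int) (bitPos & 7));
            }
        }
        ByteBuffer out = ByteBuffer.allocate(13 + data.length);
        out.putInt(MAGIC).putInt(VERSION).put((byte) bitsPerValue).putInt(values.length).put(data);
        out.flip();
        return out;
    }
}
```

Calling newDirectSource() any number of times leaves headerReads at 1, which is the cost profile the issue asks for.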

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators