lucene-dev mailing list archives

From "Adrien Grand (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-4539) DocValues impls should read all headers up-front instead of per-directsource
Date Tue, 06 Nov 2012 00:34:14 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-4539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13491089#comment-13491089 ]

Adrien Grand commented on LUCENE-4539:
--------------------------------------

Interesting, I had started to work on the _exact_ same API for LUCENE-4536 (because I needed
to read the header generated by getWriter to know how long a packed array would be depending
on the version) but I eventually decided to fix it another way.

bq. not sure if this would be useful outside of this particular issue

Maybe not but I agree that it would help make the code cleaner, so +1!

Related discussion: I think the header format is fine when packed ints are in their own file,
but when packed ints are nested in another file, they should probably not declare a codec
header: the PackedInts codec name check is redundant with the main codec name check. (I was
actually thinking of deprecating get{Reader,DirectReader,ReaderIterator} to force callers
to think about how they should store the valueCount and bitsPerValue, and to discourage them
from using a standard header when packed ints are not in their own file, which would avoid
redundant checks.)
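
To make the nested case concrete, here is a minimal sketch in plain Java (no Lucene classes). Everything in it is an assumption made for illustration: the class name NestedPackedSketch, the PARENT_MAGIC and PARENT_VERSION constants, and the naive bit packing are hypothetical and do not reflect the actual PackedInts format. It only shows the shape of the idea: the enclosing file's header is checked once, and the caller stores valueCount and bitsPerValue itself instead of wrapping the packed block in a second header.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;

/** Hypothetical sketch, not Lucene code: packed values nested in a parent file. */
public class NestedPackedSketch {
    // Hypothetical constants standing in for the enclosing file's codec header.
    static final int PARENT_MAGIC = 0x12345678;
    static final int PARENT_VERSION = 1;

    /** Writes the parent header once, then valueCount, bitsPerValue and the packed bits. */
    static byte[] write(long[] values, int bitsPerValue) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(PARENT_MAGIC);
        out.writeInt(PARENT_VERSION);
        // The caller decides how to store these two numbers; no nested header.
        out.writeInt(values.length);
        out.writeByte(bitsPerValue);
        int current = 0; // bits collected for the byte being built
        int filled = 0;  // how many bits of 'current' are in use
        for (long v : values) {
            for (int b = bitsPerValue - 1; b >= 0; b--) {
                current = (current << 1) | (int) ((v >>> b) & 1L);
                if (++filled == 8) {
                    out.writeByte(current);
                    current = 0;
                    filled = 0;
                }
            }
        }
        if (filled > 0) {
            out.writeByte(current << (8 - filled)); // pad the last byte with zeros
        }
        return bytes.toByteArray();
    }

    /** Checks the parent header once, then decodes the headerless packed block. */
    static long[] read(byte[] file) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(file));
        if (in.readInt() != PARENT_MAGIC) {
            throw new IOException("unexpected magic");
        }
        if (in.readInt() != PARENT_VERSION) {
            throw new IOException("unsupported version");
        }
        int valueCount = in.readInt();
        int bitsPerValue = in.readUnsignedByte();
        long[] values = new long[valueCount];
        int current = 0;   // byte currently being consumed
        int remaining = 0; // unread bits left in 'current'
        for (int i = 0; i < valueCount; i++) {
            long v = 0;
            for (int b = 0; b < bitsPerValue; b++) {
                if (remaining == 0) {
                    current = in.readUnsignedByte();
                    remaining = 8;
                }
                v = (v << 1) | ((current >>> (remaining - 1)) & 1);
                remaining--;
            }
            values[i] = v;
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        byte[] file = write(new long[] {3, 0, 7, 5, 1}, 3);
        System.out.println(Arrays.toString(read(file)));
    }
}
```

The only check performed on read is the parent's; the packed block contributes five bytes of metadata (valueCount and bitsPerValue) and nothing else.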

                
> DocValues impls should read all headers up-front instead of per-directsource
> ----------------------------------------------------------------------------
>
>                 Key: LUCENE-4539
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4539
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/index
>            Reporter: Robert Muir
>         Attachments: LUCENE-4539.patch
>
>
> Currently, when DocValues opens, it just opens files; it doesn't read codec headers etc.
> Instead we read these every single time a directsource opens.
> I think it should work like PostingsReaders: e.g. the PackedInts impl would read its
> versioning info and codec headers up front, and creating a new Direct impl should be an
> IndexInput.clone() + getDirectReaderNoHeader().
> Today it's much more costly.
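
The pattern described in the issue can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the attached patch: UpFrontHeaderSketch, MAGIC, headerReads and newDirectSource are hypothetical names, and ByteBuffer.duplicate() stands in for IndexInput.clone(). The header is validated once in the constructor; each direct source is then a cheap view positioned just past the header.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch, not Lucene code: validate the header once, clone per source. */
public class UpFrontHeaderSketch {
    static final int MAGIC = 0x12345678; // hypothetical header magic

    private final ByteBuffer file;
    private final int version;
    private final int dataStart;
    int headerReads = 0; // counts header validations, for illustration only

    UpFrontHeaderSketch(ByteBuffer file) {
        ByteBuffer in = file.duplicate();
        in.rewind();
        if (in.getInt() != MAGIC) {
            throw new IllegalStateException("unexpected magic");
        }
        this.version = in.getInt();
        this.dataStart = in.position(); // first byte after the header
        this.file = file;
        headerReads++;
    }

    int version() {
        return version;
    }

    /** Cheap per-source view: shares the bytes, has its own position, skips the header. */
    ByteBuffer newDirectSource() {
        ByteBuffer clone = file.duplicate();
        clone.position(dataStart);
        return clone;
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.putInt(MAGIC).putInt(1).putLong(42L);
        buf.flip();
        UpFrontHeaderSketch reader = new UpFrontHeaderSketch(buf);
        ByteBuffer a = reader.newDirectSource();
        ByteBuffer b = reader.newDirectSource();
        System.out.println(a.getLong() + " " + b.getLong() + " headerReads=" + reader.headerReads);
    }
}
```

Opening any number of sources after construction leaves headerReads at 1, which is the cost difference the issue is after.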

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

