lucene-dev mailing list archives

From "Robert Muir (JIRA)" <>
Subject [jira] [Commented] (LUCENE-4936) docvalues date compression
Date Fri, 19 Apr 2013 15:53:16 GMT


Robert Muir commented on LUCENE-4936:

First thing that sticks out is maybe to remove the extra pass? Even though it just pulls
the first value...

For the DiskDV codec I feel it's simple: just remove the loop and look for 'count == 0'.
For Lucene42, it's probably best to just add 'count' for the same reason?

But if it makes things more confusing, maybe just leave it the way it is. It's a little tricky
either way :)
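
A minimal sketch of the single-pass idea, not the code from the attached patch: the first value is captured inside the main loop when 'count == 0', so no separate pass is needed to fetch it. The class name SinglePassStats and its members are hypothetical, and the sketch assumes the values are small enough that 'v - first' cannot overflow a long.

```java
// Hypothetical helper, not part of Lucene: gathers min, count and the GCD
// of all value differences in one pass over the values.
final class SinglePassStats {
    final long min;
    final long gcd;
    final long count;

    private SinglePassStats(long min, long gcd, long count) {
        this.min = min;
        this.gcd = gcd;
        this.count = count;
    }

    // Euclid's algorithm on non-negative inputs; gcd(0, x) == x.
    static long gcd(long a, long b) {
        while (b != 0) {
            long t = a % b;
            a = b;
            b = t;
        }
        return a;
    }

    static SinglePassStats of(Iterable<Long> values) {
        long first = 0;
        long min = Long.MAX_VALUE;
        long divisor = 0;
        long count = 0;
        for (long v : values) {
            if (count == 0) {
                // First value seen: remember it instead of doing an extra pass.
                first = v;
            } else if (divisor != 1) {
                // Assumes v - first does not overflow (see lead-in).
                divisor = gcd(divisor, Math.abs(v - first));
            }
            min = Math.min(min, v);
            count++;
        }
        return new SinglePassStats(min, divisor, count);
    }
}
```

Because the resulting GCD divides every pairwise difference, each value could then be stored as '(v - min) / gcd'. A GCD of 0 means there were fewer than two values or all values were equal.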
> docvalues date compression
> --------------------------
>                 Key: LUCENE-4936
>                 URL:
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/index
>            Reporter: Robert Muir
>            Assignee: Adrien Grand
>             Fix For: 4.4
>         Attachments: LUCENE-4936.patch, LUCENE-4936.patch, LUCENE-4936.patch, LUCENE-4936.patch
> DocValues fields can be very wasteful if you are storing dates (like Solr's TrieDateField
> does if you enable docvalues) and don't actually need all the precision: e.g. "date-only"
> fields like date of birth with no time component, time fields without milliseconds precision,
> and so on.
> Ideally we'd compute the GCD of all the values to save space (numberOfTrailingZeros is not
> really enough here), but I think we should at least look for values like 86400000, 3600000,
> and 1000 to be practical.
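
To illustrate why numberOfTrailingZeros falls short here, the sketch below compares a shift by the common trailing zero bits with a division by one of the candidate values named above. It is an assumption-laden illustration, not the patch: the class DateDivisor and its methods are hypothetical, and values are assumed to be non-negative milliseconds.

```java
// Hypothetical helper, not part of Lucene.
final class DateDivisor {
    // Candidates from the issue text: milliseconds per day, hour and second.
    static final long[] CANDIDATES = {86_400_000L, 3_600_000L, 1_000L};

    // Returns the largest candidate that divides every value, or 1 if none does.
    static long pickDivisor(long[] values) {
        for (long c : CANDIDATES) {
            boolean dividesAll = true;
            for (long v : values) {
                if (v % c != 0) {
                    dividesAll = false;
                    break;
                }
            }
            if (dividesAll) {
                return c;
            }
        }
        return 1L;
    }

    // Number of low zero bits shared by all values (0 if every value is 0).
    static int commonTrailingZeros(long[] values) {
        long or = 0;
        for (long v : values) {
            or |= v;
        }
        return or == 0 ? 0 : Long.numberOfTrailingZeros(or);
    }

    // Bits needed to represent a non-negative value, at least 1.
    static int bitsRequired(long maxValue) {
        return Math.max(1, 64 - Long.numberOfLeadingZeros(maxValue));
    }
}
```

Since 86400000 = 2^10 * 84375, shifting can drop at most 10 bits from a date-only field, while dividing by 86400000 drops roughly 26. For a maximum value of ten days (864000000 ms), bitsRequired gives 30 bits raw, 20 bits after the shift, and 4 bits after the division.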
