lucene-dev mailing list archives

From Adrien Grand <jpou...@gmail.com>
Subject Re: Possible improvement: TrieDate without time of day
Date Wed, 19 Dec 2012 14:41:07 GMT
On Wed, Dec 19, 2012 at 2:46 PM, Jack Krupansky <jack@basetechnology.com> wrote:
> Although your comment seemed to imply that the new 4.1 postings format would
> store day-style dates more efficiently - could you summarize what effects we
> could see?

My point was that the 4.1 postings format compresses dense postings
lists much more efficiently. If you set the hours, minutes, seconds
and milliseconds of all your dates to 0, there will be fewer distinct
terms (assuming your precisionStep is < 64), so the terms dictionary
will be smaller and the postings lists will be denser, and therefore
more space-efficient. And if you later decide that you need second
granularity, you can simply stop setting seconds to 0, without having
to reindex.

However, writing a new 32-bit field type with day granularity, as Uwe
suggests, is the way to go to index day-only dates efficiently.
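
To make the two options concrete, here is a minimal sketch in plain Java
(no Lucene classes involved). The class and method names are made up for
illustration, and it assumes dates are milliseconds since the epoch and
that day boundaries are taken in UTC. truncateToDay keeps a 64-bit
timestamp but zeroes the time of day; toEpochDay produces the kind of
value a 32-bit day-granularity field could hold.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Lucene or Solr.
public class DayGranularity {
    static final long MILLIS_PER_DAY = TimeUnit.DAYS.toMillis(1);

    // Option 1: keep a long timestamp, but set hours, minutes, seconds
    // and milliseconds to 0 (UTC). floorDiv keeps pre-1970 dates correct.
    public static long truncateToDay(long epochMillis) {
        return Math.floorDiv(epochMillis, MILLIS_PER_DAY) * MILLIS_PER_DAY;
    }

    // Option 2: number of whole days since 1970-01-01 (UTC) as an int,
    // i.e. a value that fits a 32-bit field. Throws ArithmeticException
    // if the day count does not fit in an int.
    public static int toEpochDay(long epochMillis) {
        return Math.toIntExact(Math.floorDiv(epochMillis, MILLIS_PER_DAY));
    }

    public static void main(String[] args) {
        long ts = 1355928067000L; // 2012-12-19T14:41:07Z
        System.out.println(truncateToDay(ts)); // midnight of the same day, in millis
        System.out.println(toEpochDay(ts));    // days since the epoch
    }
}
```

With the first option the stored values stay compatible with an existing
long date field; the second needs a different field type, so switching
back to a finer granularity later would mean reindexing.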

-- 
Adrien
