lucene-dev mailing list archives

From Adrien Grand <jpou...@gmail.com>
Subject Re: Possible improvement: TrieDate without time of day
Date Wed, 19 Dec 2012 10:05:06 GMT
Hi Jack,

On Sat, Dec 15, 2012 at 4:36 PM, Jack Krupansky <jack@basetechnology.com> wrote:
> I have seen a few inquiries concerned with the overhead of storing time of
> day for simple dates. The concerns are both storage and performance. So, the
> question/proposal is whether a variant of TrieDate with no time of day
> component, call it TrieDay or TrieDateTimeless or TrieDateNoTime (or
> incompatibly rename TrieDate to TrieDateTime and use TrieDate for the new
> format), could be stored with, say, 40% more storage efficiency and maybe a
> comparable or at least significant performance improvement for queries.

Storing only the day in a 32-bit integer could save space, but I'm
not sure Solr should provide a type for every granularity of dates.
Have you tried setting your dates' hours, minutes, seconds and
milliseconds to 0 before indexing them? This should let documents from
the same day share terms (and hence postings lists) and improve
storage efficiency (especially with the new Lucene41PostingsFormat).
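
A minimal sketch of that truncation, assuming dates are handled as
epoch milliseconds and that a "day" means a UTC calendar day. The class
and method names are made up for illustration and are not part of
Lucene or Solr; adding the resulting value to a document is left out.

```java
public class DayTruncation {
    private static final long MILLIS_PER_DAY = 86_400_000L;

    // Round an epoch-millisecond timestamp down to 00:00:00.000 UTC of the
    // same day. floorDiv keeps the rounding direction correct for
    // timestamps before 1970 (negative values).
    static long truncateToDayUtc(long epochMillis) {
        return Math.floorDiv(epochMillis, MILLIS_PER_DAY) * MILLIS_PER_DAY;
    }

    // Alternative: keep only the day number, which fits in a 32-bit int.
    // toIntExact throws instead of silently overflowing on extreme inputs.
    static int daysSinceEpoch(long epochMillis) {
        return Math.toIntExact(Math.floorDiv(epochMillis, MILLIS_PER_DAY));
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        System.out.println("truncated millis: " + truncateToDayUtc(now));
        System.out.println("days since epoch: " + daysSinceEpoch(now));
    }
}
```

The truncated value can still go into a regular date field, so no new
field type is needed. The day-number variant would only pay off with an
int-based field and needs a conversion on the query side.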

-- 
Adrien

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

