lucene-java-user mailing list archives

From "Jack Krupansky" <j...@basetechnology.com>
Subject Re: precisionStep for days in TrieDate
Date Fri, 14 Dec 2012 23:11:21 GMT
Thanks, you answered the main question: 26 doesn't simply lop off the time 
of day. I still don't completely follow how trie works, though (without 
reading the paper itself).

-- Jack Krupansky

-----Original Message----- 
From: Uwe Schindler
Sent: Friday, December 14, 2012 5:58 PM
To: java-user@lucene.apache.org
Subject: RE: precisionStep for days in TrieDate

Hi,

> If I specify a precisionStep of 26 for a TrieDate field, what rough impact
> should this have on both performance and index size?

This value is mostly useless; any precisionStep > 8 slows the queries down 
to the speed of TermRangeQuery.

> The input data has time in it, but the milliseconds per day is not needed
> for the app. Will Lucene store only the top 64 minus 26 bits of data and
> discard the low 26 bits?

No, you may need to read the Javadocs of NumericRangeQuery, now updated with 
formulas: http://goo.gl/nyXQR
The precisionStep is the number of bits of the indexed value after which a 
new term starts. The original value is always indexed in full precision. 
A precision step of 4 for a 32-bit value (integer) means terms with these 
bit counts:
All 32, left 28, left 24, left 20, left 16, left 12, left 8, left 4 bits of 
the value (total 8 terms/value). A precision step of 26 for the same 32-bit 
value would index 2 terms: all 32 bits, and one single term with the 
remaining 6 bits from the left.
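The bit counts above can be reproduced with a few lines of plain Java. This is only a sketch of the arithmetic, not Lucene's own code: the class TriePrefixes and the method prefixBitCounts are hypothetical names made up for illustration, and the 64-bit call assumes that a TrieDate value is a long and follows the same rule as the 32-bit example.

```java
import java.util.ArrayList;
import java.util.List;

public class TriePrefixes {

    // Hypothetical helper (not a Lucene API): for each indexed term, report how
    // many leading bits of the value it keeps. A new term starts every
    // precisionStep bits, and the full-precision term always comes first.
    static List<Integer> prefixBitCounts(int valueBits, int precisionStep) {
        if (valueBits < 1 || precisionStep < 1) {
            throw new IllegalArgumentException("valueBits and precisionStep must be >= 1");
        }
        List<Integer> counts = new ArrayList<>();
        for (int shift = 0; shift < valueBits; shift += precisionStep) {
            counts.add(valueBits - shift); // bits left after dropping 'shift' low bits
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(prefixBitCounts(32, 4));  // [32, 28, 24, 20, 16, 12, 8, 4]
        System.out.println(prefixBitCounts(32, 26)); // [32, 6]
        // Assumption: the same rule applied to a 64-bit long.
        System.out.println(prefixBitCounts(64, 26)); // [64, 38, 12]
    }
}
```

In every case the first entry is the full width, which is the point made above: the complete value is indexed regardless of the precision step.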

> I’ve read that a higher precisionStep will lower performance. Will a
> precisionStep of 26 have dramatically lower performance when referencing
> days (without time of day)?

See above. The assumption that 26 will limit precision to days is wrong.
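A quick check of the arithmetic shows why 26 bits could not correspond to a day in any case, even if low bits were dropped. The sketch below assumes the date is held as milliseconds since the epoch in UTC; DayTruncation, dropLowBits and floorToDay are hypothetical names for illustration, and rounding in the application before indexing is just one possible way to discard the time of day.

```java
public class DayTruncation {

    static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000; // 86,400,000 ms

    // What "discarding the low bits" would do to a value (NOT what precisionStep does).
    static long dropLowBits(long millis, int bits) {
        return (millis >> bits) << bits;
    }

    // Rounds an epoch-millisecond timestamp down to the start of its UTC day.
    static long floorToDay(long millis) {
        return Math.floorDiv(millis, MILLIS_PER_DAY) * MILLIS_PER_DAY;
    }

    public static void main(String[] args) {
        System.out.println(1L << 26);       // 67108864 ms, roughly 18.6 hours
        System.out.println(MILLIS_PER_DAY); // 86400000 ms

        long noonOnDayOne = MILLIS_PER_DAY + 12L * 60 * 60 * 1000;
        System.out.println(floorToDay(noonOnDayOne));      // 86400000
        System.out.println(dropLowBits(noonOnDayOne, 26)); // 67108864, not a day boundary
    }
}
```

Because a day is not a power of two in milliseconds, no bit count lines up with day boundaries.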

> I suppose that the piece of information I am missing is whether trie
> precisionStep simply affects some extra index table that trie keeps beyond
> the raw data values or the data values themselves.

It only affects how the value is indexed (how many terms), but not the 
value.

Uwe


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org 