lucene-dev mailing list archives

From "Uwe Schindler (JIRA)" <>
Subject [jira] Commented: (LUCENE-1712) Set default precisionStep for NumericField and NumericRangeFilter
Date Tue, 23 Jun 2009 07:36:07 GMT


Uwe Schindler commented on LUCENE-1712:

In my opinion, the precisionStep parameter should stay required, with the 4 constants
for each data type provided in NumericUtils.

On the other hand, 4 is (maybe) a good default, so I propose that all ctors/factories that take
a precisionStep default it to 4 if it is left out. precisionStep is a final variable in NumericTokenStream
(and so in NumericField), because it does not make sense to change it. If "field" is final,
precisionStep should also be final (one field must always use the same precision step). In
principle Mike is right: the type is also fixed after the first call to setXxxValue, so I could
throw an IllegalArgumentException if somebody calls a setter for a different data type after the first one. An
IllegalStateException is thrown when the value was not initialized and the docinverter tries to
use the token stream.
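
The behaviour described above can be sketched roughly as follows. This is only an illustration, not Lucene's code: the class FixedTypeNumericStream, its reset() hook and the ValueType enum are hypothetical names, and only int and long setters are shown.

```java
/** Hypothetical stand-in for a numeric token stream; not Lucene's actual class. */
final class FixedTypeNumericStream {
    static final int DEFAULT_PRECISION_STEP = 4; // the default proposed in this comment

    private enum ValueType { INT, LONG }

    private final int precisionStep; // final: one field always uses the same step
    private ValueType type;          // null until the first setXxxValue call
    private long value;

    /** Ctor without precisionStep falls back to the default. */
    FixedTypeNumericStream() {
        this(DEFAULT_PRECISION_STEP);
    }

    FixedTypeNumericStream(int precisionStep) {
        if (precisionStep < 1) {
            throw new IllegalArgumentException("precisionStep must be >= 1");
        }
        this.precisionStep = precisionStep;
    }

    FixedTypeNumericStream setIntValue(int v) {
        fixType(ValueType.INT);
        value = v;
        return this;
    }

    FixedTypeNumericStream setLongValue(long v) {
        fixType(ValueType.LONG);
        value = v;
        return this;
    }

    /** The first setter call fixes the type; a later call with another type fails. */
    private void fixType(ValueType requested) {
        if (type != null && type != requested) {
            throw new IllegalArgumentException(
                "stream already holds a " + type + " value, cannot switch to " + requested);
        }
        type = requested;
    }

    /** Assumed hook that a consumer calls before reading tokens. */
    void reset() {
        if (type == null) {
            throw new IllegalStateException("call setXxxValue() before using the stream");
        }
    }

    int getPrecisionStep() { return precisionStep; }

    long getValue() { return value; }
}
```

With this sketch, new FixedTypeNumericStream().setIntValue(5).setLongValue(5L) fails with an IllegalArgumentException, while calling reset() on a fresh instance fails with an IllegalStateException.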

Here are two ideas for handling the default precStep per type:

1. Special value 0 as default precStep:
- The no-precStep ctor sets the precStep in NumericTokenStream to 0 (an otherwise invalid value); if one is
given, it must be >0 and <=valSize.
- When delivering the tokens, NumericTokenStream uses the default for the data type if precStep==0,
and the given value in all other cases.
In this case the precStep is still final in NumericTokenStream, with 0 meaning "automatic".
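
A minimal sketch of idea 1, under some assumptions: the class AutoStepStream and the method effectivePrecisionStep() are hypothetical, the per-type defaults are the 4 (int) and 6 (long) mentioned in the issue description, and the "<=valSize" check is split in two because the value size is not yet known in the ctor.

```java
/** Hypothetical sketch of the "0 means automatic" idea; not Lucene's actual class. */
final class AutoStepStream {
    static final int AUTO = 0;              // sentinel: use the default for the data type
    static final int DEFAULT_INT_STEP = 4;  // from the issue description
    static final int DEFAULT_LONG_STEP = 6; // from the issue description

    private final int precisionStep; // still final; 0 stands for "automatic"
    private int valSize;             // 32 or 64; 0 until a value has been set

    /** No-precStep ctor: stores the sentinel. */
    AutoStepStream() {
        this.precisionStep = AUTO;
    }

    /** Explicit precStep: must be > 0; the upper bound is re-checked per type in the setters. */
    AutoStepStream(int precisionStep) {
        if (precisionStep < 1 || precisionStep > 64) {
            throw new IllegalArgumentException("precisionStep must be > 0 and <= 64");
        }
        this.precisionStep = precisionStep;
    }

    AutoStepStream setIntValue(int v) {
        checkStepFits(32);
        valSize = 32;
        return this;
    }

    AutoStepStream setLongValue(long v) {
        checkStepFits(64);
        valSize = 64;
        return this;
    }

    private void checkStepFits(int size) {
        if (precisionStep > size) {
            throw new IllegalArgumentException("precisionStep must be <= " + size);
        }
    }

    /** Resolved only when tokens are delivered, so the field itself can stay final. */
    int effectivePrecisionStep() {
        if (valSize == 0) {
            throw new IllegalStateException("no value set");
        }
        if (precisionStep == AUTO) {
            return valSize == 32 ? DEFAULT_INT_STEP : DEFAULT_LONG_STEP;
        }
        return precisionStep;
    }
}
```

The point of the design is that the stored precStep never changes after construction; only its interpretation depends on the data type that is set later.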

2. There is one other idea:
NumericField/-TokenStream could have a required ctor param "type" that can be NumericField.Type.INT,...
In this case the default could be chosen very simply at the beginning. It also fixes
the data type: if somebody calls setDoubleValue but has initialized the TokenStream with NumericField.Type.INT,
he will get an UnsupportedOperationException.
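
Idea 2 could look roughly like the sketch below. The class TypedNumericStream and its nested Type enum are hypothetical stand-ins for the proposed NumericField.Type. The defaults of 4 for INT and 6 for LONG come from the issue description; reusing them for FLOAT and DOUBLE is an assumption made here for illustration only.

```java
/** Hypothetical sketch of a type-parameterized stream; not Lucene's actual class. */
final class TypedNumericStream {
    enum Type { INT, LONG, FLOAT, DOUBLE }

    private final Type type;         // fixed by the required ctor param
    private final int precisionStep;
    private Number value;

    /** The type is known up front, so the default can be chosen right away. */
    TypedNumericStream(Type type) {
        this(type, defaultPrecisionStep(type));
    }

    TypedNumericStream(Type type, int precisionStep) {
        if (type == null) {
            throw new NullPointerException("type");
        }
        if (precisionStep < 1) {
            throw new IllegalArgumentException("precisionStep must be >= 1");
        }
        this.type = type;
        this.precisionStep = precisionStep;
    }

    static int defaultPrecisionStep(Type type) {
        switch (type) {
            case INT:
            case FLOAT:  // assumption: 32-bit types share the int default
                return 4;
            case LONG:
            case DOUBLE: // assumption: 64-bit types share the long default
                return 6;
            default:
                throw new AssertionError(type);
        }
    }

    TypedNumericStream setIntValue(int v) {
        require(Type.INT);
        value = v;
        return this;
    }

    TypedNumericStream setDoubleValue(double v) {
        require(Type.DOUBLE);
        value = v;
        return this;
    }

    /** A setter that does not match the declared type is rejected. */
    private void require(Type expected) {
        if (type != expected) {
            throw new UnsupportedOperationException(
                "stream was created for " + type + ", not " + expected);
        }
    }

    int getPrecisionStep() { return precisionStep; }

    Number getValue() { return value; }
}
```

Compared with the sentinel approach, the type check here happens on every setter call rather than only on the second one, at the cost of one more required ctor argument.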

The javadocs should always clearly note that one should work out a good precStep.

By the way: it is also a good idea to use valSize (32 or 64) as precisionStep if
you do not want to do range queries on the field (and use it only for sorting). Range queries
would still work, but would be as slow as conventional ones (current Solr trunk contains this hint
in its TrieField docs/schema).
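
To make the effect of precisionStep == valSize concrete, the sketch below counts the terms indexed per value. It assumes that one term is produced for each shift of precisionStep bits until valSize is reached, which is how the trie encoding is understood here; TermCountSketch and termsPerValue are hypothetical names, not part of Lucene.

```java
/** Hypothetical helper: counts terms per value under the stated assumption. */
final class TermCountSketch {
    static int termsPerValue(int valSize, int precisionStep) {
        if (valSize != 32 && valSize != 64) {
            throw new IllegalArgumentException("valSize must be 32 or 64");
        }
        if (precisionStep < 1) {
            throw new IllegalArgumentException("precisionStep must be >= 1");
        }
        int count = 0;
        // One term for each shift: 0, precisionStep, 2 * precisionStep, ... while below valSize.
        for (int shift = 0; shift < valSize; shift += precisionStep) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(termsPerValue(32, 4));  // 8
        System.out.println(termsPerValue(64, 6));  // 11
        System.out.println(termsPerValue(32, 32)); // 1
        System.out.println(termsPerValue(64, 64)); // 1
    }
}
```

With precisionStep equal to valSize only the full-precision term remains, which is enough for sorting but leaves a range query with no lower-precision terms to use.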

> Set default precisionStep for NumericField and NumericRangeFilter
> -----------------------------------------------------------------
>                 Key: LUCENE-1712
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.9
>            Reporter: Michael McCandless
>            Priority: Minor
>             Fix For: 2.9
> This is a spinoff from LUCENE-1701.
> A user using Numeric* should not need to understand what's
> "under the hood" in order to do their indexing & searching.
> They should be able to simply:
> {code}
> doc.add(new NumericField("price", 15.50));
> {code}
> And have a decent default precisionStep selected for them.
> Actually, if we add ctors to NumericField for each of the supported
> types (so the above code works), we can set the default per-type.  I
> think we should do that?
> 4 for int and 6 for long was proposed as good defaults.
> The default need not be "perfect", as advanced users can always
> optimize their precisionStep, and for users experiencing slow
> RangeQuery performance, NumericRangeQuery with any of the defaults we
> are discussing will be much faster.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
