lucene-dev mailing list archives

From "Vinicius Barros (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-1768) NumericRange support for new query parser
Date Sat, 06 Aug 2011 05:27:27 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080357#comment-13080357
] 

Vinicius Barros commented on LUCENE-1768:
-----------------------------------------

Sorry for being silent for so long. I have been trying to get some results, but many doubts
came up and I need some guidance here.

First, I started implementing the >=, <=, <, > and = operators for the standard
query parser. I later realized someone had already submitted a patch for that, so I stopped
working on it.

Then I decided to implement support for date resolutions for numeric queries in query parser.
I started by changing NumberDateFormat to receive a resolution parameter (DAY, SECONDS, MINUTES,
etc) and this new parameter is taken into account when doing the date conversion. For that,
I added a new method to do the date rounding that takes TimeZone into account, since the current
round methods in DateTools do not support timezone. I was able to make it work up to this
part.
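
A minimal sketch of what timezone-aware rounding could look like, using only java.util classes.
The class TimeZoneRounding, its Resolution enum and the roundDown method are hypothetical names
chosen for illustration; this is not the code from the patch.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

final class TimeZoneRounding {

    /** Hypothetical resolution levels, coarsest first. */
    enum Resolution { MONTH, DAY, HOUR, MINUTE, SECOND }

    /**
     * Rounds the given epoch milliseconds down to the start of the requested
     * unit, as seen in the given time zone.
     */
    static long roundDown(long millis, Resolution res, TimeZone tz) {
        Calendar cal = new GregorianCalendar(tz);
        cal.setTimeInMillis(millis);
        // Each case intentionally falls through, so every field finer than
        // the requested resolution is reset.
        switch (res) {
            case MONTH:
                cal.set(Calendar.DAY_OF_MONTH, 1);
            case DAY:
                cal.set(Calendar.HOUR_OF_DAY, 0);
            case HOUR:
                cal.set(Calendar.MINUTE, 0);
            case MINUTE:
                cal.set(Calendar.SECOND, 0);
            case SECOND:
                cal.set(Calendar.MILLISECOND, 0);
        }
        return cal.getTimeInMillis();
    }
}
```

With a fixed-offset zone such as "GMT-08:00", rounding to DAY yields local midnight, which is
08:00 UTC, so the same instant can round to different values depending on the zone passed in.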

After that, I started to work on date compression, as you suggested earlier, Uwe. For example,
if the user wants a DAY resolution, NumberDateFormat should only return the number of days
since 1970 for the given date, not the milliseconds. For SECOND resolution, it's easy: just
divide the milliseconds by 1000. For minutes it's the same: divide the milliseconds by 1000 * 60,
and so on. However, when I got to months, I found no easy way to compress the milliseconds, I
mean, no easy way to truncate the days and get only the month count since 1970. The only good
solution I found was to take the number of years since 1970, multiply it by 12, and add the
current month number.
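
The compression described above could be sketched roughly as follows. The class DateCompression
and its method names are hypothetical and not taken from the patch. Fixed-length units use
floor division (so values before 1970 also round down), and months use the
"years since 1970 times 12 plus month" calculation. The sketch assumes dates in the common era.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

final class DateCompression {

    static final long MILLIS_PER_SECOND = 1000L;
    static final long MILLIS_PER_MINUTE = 60L * MILLIS_PER_SECOND;
    static final long MILLIS_PER_DAY = 24L * 60L * MILLIS_PER_MINUTE;

    /** Whole seconds since the epoch. */
    static long toSeconds(long millis) {
        return Math.floorDiv(millis, MILLIS_PER_SECOND);
    }

    /** Whole minutes since the epoch. */
    static long toMinutes(long millis) {
        return Math.floorDiv(millis, MILLIS_PER_MINUTE);
    }

    /** Whole local calendar days since 1970-01-01 in the given time zone. */
    static long toDays(long millis, TimeZone tz) {
        // Shift into local time first so the day boundary follows the zone.
        long local = millis + tz.getOffset(millis);
        return Math.floorDiv(local, MILLIS_PER_DAY);
    }

    /** Months since January 1970 in the given time zone. */
    static long toMonths(long millis, TimeZone tz) {
        Calendar cal = new GregorianCalendar(tz);
        cal.setTimeInMillis(millis);
        long years = cal.get(Calendar.YEAR) - 1970L;
        // Calendar.MONTH is zero-based (JANUARY == 0).
        return years * 12L + cal.get(Calendar.MONTH);
    }
}
```

For the timestamp of this message (2011-08-06 05:27:27 GMT), toDays in UTC gives 15192 and
toMonths in UTC gives 41 * 12 + 7 = 499.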

I am still wondering whether we can always assume that dividing the milliseconds by 1000 (sec),
60 (minute), 60 (hour) and 24 (day) will actually be precise. What about leap seconds?
I am not sure whether the milliseconds -> (defined_resolution) and (defined_resolution) -> milliseconds
conversions will always be correct. Maybe I am missing something or overcomplicating this.
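
One way to probe this question is a small check like the one below. To the best of my
understanding, java.util.Calendar does not model leap seconds and treats every UTC day as
exactly 86,400,000 ms, so the fixed divisors should be exact in UTC; the sketch tests that
assumption instead of relying on it. The class RoundTripCheck and its methods are hypothetical
names, not part of the patch.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

final class RoundTripCheck {

    static final long MILLIS_PER_DAY = 24L * 60L * 60L * 1000L;

    /** milliseconds -> days, rounding toward negative infinity. */
    static long compressDays(long millis) {
        return Math.floorDiv(millis, MILLIS_PER_DAY);
    }

    /** days -> milliseconds at the start of that UTC day. */
    static long expandDays(long days) {
        return days * MILLIS_PER_DAY;
    }

    /**
     * Walks the UTC calendar one day at a time from 1970-01-01 and checks
     * that each midnight equals dayIndex * MILLIS_PER_DAY.
     */
    static boolean calendarAgrees(int dayCount) {
        Calendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        cal.clear();
        cal.set(1970, Calendar.JANUARY, 1);
        for (int day = 0; day < dayCount; day++) {
            if (cal.getTimeInMillis() != expandDays(day)) {
                return false;
            }
            cal.add(Calendar.DAY_OF_MONTH, 1);
        }
        return true;
    }
}
```

Note that the round trip is lossy by design: expandDays(compressDays(ms)) returns the start of
the day, not the original value, while compressDays(expandDays(n)) should always return n. In
zones with daylight saving time a local day can be 23 or 25 hours long, so the fixed divisor
only applies after shifting into UTC or local time consistently.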

> NumericRange support for new query parser
> -----------------------------------------
>
>                 Key: LUCENE-1768
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1768
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: core/queryparser
>    Affects Versions: 2.9
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>              Labels: contrib, gsoc, gsoc2011, lucene-gsoc-11, mentor
>             Fix For: 4.0
>
>         Attachments: TestNumericQueryParser-fix.patch, TestNumericQueryParser-fix.patch,
TestNumericQueryParser-fix.patch, TestNumericQueryParser-fix.patch, week-7.patch, week-8.patch,
week1.patch, week2.patch, week3.patch, week4.patch, week5-6.patch
>
>
> It would be good to specify some type of "schema" for the query parser in future, to
automatically create NumericRangeQuery for different numeric types? It would then be possible
to index a numeric value (double,float,long,int) using NumericField and then the query parser
knows, which type of field this is and so it correctly creates a NumericRangeQuery for strings
like "[1.567..*]" or "(1.787..19.5]".
> There is currently no way to extract if a field is numeric from the index, so the user
will have to configure the FieldConfig objects in the ConfigHandler. But if this is done,
it will not be that difficult to implement the rest.
> The only difference between the current handling of RangeQuery is then the instantiation
of the correct Query type and conversion of the entered numeric values (simple Number.valueOf(...)
cast of the user entered numbers). Everything else is identical, NumericRangeQuery also
supports the MTQ rewrite modes (as it is a MTQ).
> Another thing is a change in Date semantics. There are some strange flags in the current
parser that tell it how to handle dates.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org
