lucene-dev mailing list archives

From "Uwe Schindler (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (LUCENE-1768) NumericRange support for new query parser
Date Thu, 14 Jul 2011 18:10:00 GMT

     [ https://issues.apache.org/jira/browse/LUCENE-1768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-1768:
----------------------------------

    Attachment: TestNumericQueryParser-fix.patch

Hi Vinicius,

I tested your patch after converting it to the new trunk directory/package layout (thanks,
Chris Male :-).

It still fails quite often for some locales (at least in Java 6). The problem is that some
locales produce date formats that are not immune to case changes. As QueryParser seems to
lowercase the range bounds, some dates cannot be parsed.

With your patch this threw an NPE, because NumberDateFormat was implemented incorrectly: public
Number parse(String source, ParsePosition parsePosition) is allowed to return null (and must
do so if a parse error occurs). The same applies to date formats, but calling getTime() on a
null Date throws an NPE. So the attached patch also fixes NumberDateFormat to handle null
correctly.
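
To illustrate the null handling described above, here is a minimal sketch of a NumberFormat
that wraps a DateFormat and passes a failed parse through as null instead of dereferencing it.
This is not the code from the attached patch: the class name NullSafeNumberDateFormat and its
constructor are hypothetical, and only java.text and java.util classes are used.

```java
import java.text.DateFormat;
import java.text.FieldPosition;
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Date;

// Hypothetical sketch, not the class from the patch: exposes a DateFormat
// as a NumberFormat whose numbers are milliseconds since the epoch.
final class NullSafeNumberDateFormat extends NumberFormat {
    private static final long serialVersionUID = 1L;

    private final DateFormat dateFormat;

    NullSafeNumberDateFormat(DateFormat dateFormat) {
        this.dateFormat = dateFormat;
    }

    @Override
    public StringBuffer format(double number, StringBuffer toAppendTo, FieldPosition pos) {
        // Fractional milliseconds are truncated.
        return dateFormat.format(new Date((long) number), toAppendTo, pos);
    }

    @Override
    public StringBuffer format(long number, StringBuffer toAppendTo, FieldPosition pos) {
        return dateFormat.format(new Date(number), toAppendTo, pos);
    }

    @Override
    public Number parse(String source, ParsePosition parsePosition) {
        // DateFormat.parse(String, ParsePosition) returns null on a parse error,
        // so check before calling getTime() and return null in that case.
        Date date = dateFormat.parse(source, parsePosition);
        return (date == null) ? null : Long.valueOf(date.getTime());
    }
}
```

Because the two-argument parse returns null and leaves the parse position at 0 on failure, the
inherited NumberFormat.parse(String) reports the error as a ParseException rather than an NPE.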

I also changed the test initialization a bit to produce sane dates from the beginning.

I then added a toLowerCase(LOCALE) call to the sanity checker.

Now the static initializer fails for:
{noformat}
ant test -Dtestcase=TestNumericQueryParser -Dtestmethod=testInclusiveNumericRange -Dtests.seed=5825000776503943381:-1057095952794658416
{noformat}

As a lowercased date cannot be parsed, this fails with a ParseException. The locale is "es",
and the Spanish translation of "GMT" is case sensitive:

This parses:
{noformat}
domingo 19 de agosto de 1973 11H31' GAMT AD 34 -0900 1973
{noformat}

This does not:
{noformat}
domingo 19 de agosto de 1973 11h31' gamt ad 34 -0900 1973
{noformat}
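
The kind of sanity check described above can be sketched as follows: format a date, lowercase
the result, and see whether the same format still parses it completely. This is an assumption
about the shape of such a check, not the test code from the patch; the class and method names
are hypothetical. Which locales actually fail depends on the JDK's locale data, so the usage
note below relies on a quoted literal in the pattern instead of the "es" locale.

```java
import java.text.DateFormat;
import java.text.ParsePosition;
import java.util.Date;
import java.util.Locale;

// Hypothetical helper, not taken from TestNumericQueryParser.
final class LowercaseSanityCheck {
    private LowercaseSanityCheck() {}

    /**
     * Returns true if the given date, once formatted and lowercased with the
     * given locale, can still be parsed in full by the same format.
     */
    static boolean survivesLowercasing(DateFormat format, Date date, Locale locale) {
        String lowercased = format.format(date).toLowerCase(locale);
        ParsePosition position = new ParsePosition(0);
        // parse(String, ParsePosition) returns null instead of throwing on failure.
        Date parsed = format.parse(lowercased, position);
        return parsed != null && position.getIndex() == lowercased.length();
    }
}
```

For example, a SimpleDateFormat with the purely numeric pattern "yyyy-MM-dd HH:mm" passes this
check, while one with the pattern "yyyy-MM-dd 'GMT'" does not, because quoted literals are
matched case-sensitively. A test could skip any locale whose format fails the check.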

> NumericRange support for new query parser
> -----------------------------------------
>
>                 Key: LUCENE-1768
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1768
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: core/queryparser
>    Affects Versions: 2.9
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>              Labels: contrib, gsoc, gsoc2011, lucene-gsoc-11, mentor
>             Fix For: 4.0
>
>         Attachments: TestNumericQueryParser-fix.patch, TestNumericQueryParser-fix.patch,
>                      week-7.patch, week1.patch, week2.patch, week3.patch, week4.patch, week5-6.patch
>
>
> It would be good to be able to specify some type of "schema" for the query parser in the
> future, so that it automatically creates NumericRangeQuery for the different numeric types.
> It would then be possible to index a numeric value (double, float, long, int) using
> NumericField; the query parser would then know which type of field this is and correctly
> create a NumericRangeQuery for strings like "[1.567..*]" or "(1.787..19.5]".
> There is currently no way to find out from the index whether a field is numeric, so the user
> will have to configure the FieldConfig objects in the ConfigHandler. But once this is done,
> it will not be that difficult to implement the rest.
> The only difference from the current handling of RangeQuery is then the instantiation
> of the correct Query type and the conversion of the entered numeric values (a simple
> Number.valueOf(...) cast of the user-entered numbers). Everything else is identical;
> NumericRangeQuery also supports the MTQ rewrite modes (as it is an MTQ).
> Another thing is a change in Date semantics. There are some strange flags in the current
> parser that tell it how to handle dates.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
