lucene-dev mailing list archives

From Scott Ganyo <scott.ga...@eTapestry.com>
Subject RE: parsing range queries: should they be analyzed?
Date Mon, 21 Jan 2002 17:53:42 GMT
Ok with me.

Scott

> -----Original Message-----
> From: Doug Cutting [mailto:DCutting@grandcentral.com]
> Sent: Monday, January 21, 2002 12:41 PM
> To: 'lucene-dev@jakarta.apache.org'
> Subject: parsing range queries: should they be analyzed?
> 
> 
> (I have some time to work on Lucene this week, so I'm going through old
> changes that I made and never had the time to commit or discuss.)
> 
> It seems to me that the terms in a range query should not be analyzed.  If
> one wishes to, e.g., look for terms starting with "intens", one does not
> want this to be stemmed to "inten" and have "intend" match.
> 
> The diff for this change is below.
> 
> Do folks agree, or are there reasons that range query terms must be
> analyzed?
> 
> Doug
> 
> 
> 
> Index: QueryParser.jj
> ===================================================================
> RCS file: /home/cvs/jakarta-lucene/src/java/org/apache/lucene/queryParser/QueryParser.jj,v
> retrieving revision 1.9
> diff -u -w -u -w -r1.9 QueryParser.jj
> --- QueryParser.jj	17 Jan 2002 02:49:22 -0000	1.9
> +++ QueryParser.jj	21 Jan 2002 17:34:47 -0000
> @@ -204,39 +204,6 @@
>      }
>    }
>  
> -  private Query getRangeQuery(String field, 
> -                              Analyzer analyzer, 
> -                              String queryText, 
> -                              boolean inclusive) 
> -  {
> -    // Use the analyzer to get all the tokens.  There should be 1 or 2.
> -    TokenStream source = analyzer.tokenStream(field, 
> -                                              new StringReader(queryText));
> -    Term[] terms = new Term[2];
> -    org.apache.lucene.analysis.Token t;
> -
> -    for (int i = 0; i < 2; i++)
> -    {
> -      try 
> -      {
> -        t = source.next();
> -      } 
> -      catch (IOException e) 
> -      {
> -        t = null;
> -      }
> -      if (t != null)
> -      {
> -        String text = t.termText();
> -        if (!text.equalsIgnoreCase("NULL"))
> -        {
> -          terms[i] = new Term(field, text);
> -        }
> -      }
> -    }
> -    return new RangeQuery(terms[0], terms[1], inclusive);
> -  }
> -
>    public static void main(String[] args) throws Exception {
>      QueryParser qp = new QueryParser("field", 
>                             new org.apache.lucene.analysis.SimpleAnalyzer());
> @@ -287,8 +254,10 @@
>  | <PREFIXTERM:  <_TERM_START_CHAR> (<_TERM_CHAR>)* "*" >
>  | <WILDTERM:  <_TERM_START_CHAR> 
>                (<_TERM_CHAR> | ( [ "*", "?" ] ))* >
> -| <RANGEIN:   "[" ( ~[ "]" ] )+ "]">
> -| <RANGEEX:   "{" ( ~[ "}" ] )+ "}">
> +| <RANGE_IN_OPEN:   "[" >
> +| <RANGE_IN_CLOSE:  "]" >
> +| <RANGE_EX_OPEN:   "{" >
> +| <RANGE_EX_CLOSE:  "}" >
>  }
>  
>  <Boost> TOKEN : {
> @@ -371,7 +340,7 @@
>      
>  
>  Query Term(String field) : { 
> -  Token term, boost=null;
> +  Token term, boost=null, term2=null;
>    boolean prefix = false;
>    boolean wildcard = false;
>    boolean fuzzy = false;
> @@ -399,11 +368,20 @@
>         else
>           q = getFieldQuery(field, analyzer, term.image); 
>       }
> -     | ( term=<RANGEIN> { rangein=true; } | term=<RANGEEX> )
> -       [ <CARAT> boost=<NUMBER> ]
> +     | (
> +        (<RANGE_IN_OPEN> { rangein=true; }
> +         term=<TERM> 
> +         (<MINUS> term2=<TERM>)
> +         <RANGE_IN_CLOSE>)
> +        |
> +        (<RANGE_EX_OPEN>
> +         term=<TERM> 
> +         (<MINUS> term2=<TERM>)
> +        <RANGE_EX_CLOSE> )
> +       )
>          {
> -          q = getRangeQuery(field, analyzer, 
> -                            term.image.substring(1, term.image.length()-1),
> +       q = new RangeQuery(new Term(field, term.image),
> +                          new Term(field, term2.image),
>                              rangein);
>          }
>       | term=<QUOTED> 
> 
> --
> To unsubscribe, e-mail:   <mailto:lucene-dev-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-dev-help@jakarta.apache.org>
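
The stemming concern in the quoted message can be illustrated without Lucene at all. The sketch below is only an illustration under stated assumptions: the class RangeBoundsDemo, the toyStem helper (which merely strips one trailing "s" and is not a real stemmer), and the sample terms are all hypothetical. A sorted set stands in for the term dictionary and plain lexicographic order stands in for how a range query selects terms. Stemming the lower bound from "intens" to "inten" widens the range enough to admit "intend".

```java
import java.util.Arrays;
import java.util.SortedSet;
import java.util.TreeSet;

public class RangeBoundsDemo {

    // Hypothetical stand-in for a stemming analyzer: strips one trailing "s".
    // This is NOT Lucene's stemmer, just enough to turn "intens" into "inten".
    static String toyStem(String term) {
        if (term.length() > 1 && term.endsWith("s")) {
            return term.substring(0, term.length() - 1);
        }
        return term;
    }

    // Returns the indexed terms in [lower, upper), compared lexicographically.
    static SortedSet<String> range(TreeSet<String> index, String lower, String upper) {
        return index.subSet(lower, true, upper, false);
    }

    public static void main(String[] args) {
        // Hypothetical term dictionary for one field.
        TreeSet<String> index = new TreeSet<>(
            Arrays.asList("intend", "intense", "intensity", "intent"));

        // Raw lower bound: only terms at or after "intens" are selected.
        System.out.println(range(index, "intens", "intent"));
        // prints [intense, intensity]

        // Stemmed lower bound "inten": "intend" now falls inside the range.
        System.out.println(range(index, toyStem("intens"), "intent"));
        // prints [intend, intense, intensity]
    }
}
```

With the bounds passed through untouched, as the proposed grammar change does, the first result is what a user typing the range would expect.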
