lucene-dev mailing list archives

From sergiu gordea <gser...@ifit.uni-klu.ac.at>
Subject Re: suffix queries in lucene ....
Date Fri, 11 Feb 2005 16:41:47 GMT
Hi all,

  I would like to contribute to Lucene by updating QueryParser so that
it allows suffix queries.
I patched my copy of Lucene to allow *term in search strings, so that
such terms are valid WildTerms.

I've seen the discussions on this topic and I understand why the Lucene
developers don't want to support suffix queries, but supporting them is
one of the requirements of our project, because they are very useful
for the German language.
I would suggest making it possible to turn this functionality on and off.

My problem is that I am not familiar with JavaCC. Can anyone help me
implement this?

  Thanks in advance,

   Sergiu


> Sergiu,
>
> I'm swamped for time myself as well as inexperienced with JavaCC other 
> than minor tweaks.  So your best bet is to ask questions on the 
> lucene-user or -dev lists.
>
> I suspect there is a more JavaCC-centric way to allow the grammar to 
> be more dynamic than your suggestion.
>
> Thanks for tackling this.
>
>     Erik
>
>
> On Feb 9, 2005, at 12:55 PM, sergiu gordea wrote:
>
>> Hi Erik,
>>
>>  I was proposing to update QueryParser to allow the construction of 
>> suffix queries.
>> I took a look in QueryParser.jj and the QueryParser.java and I 
>> figured out how to implement this functionality.
>>
>> I want to add the field
>> boolean allowSuffixQueries = false;
>> and getter, setter methods.
>>
>> The definition of the WILDTERM in QueryParser.jj must be changed from:
>> <WILDTERM:  (<_TERM_CHAR> | ( [ "*", "?" ] ))* >
>> to:
>> <WILDTERM:  (<_TERM_START_CHAR> (<_TERM_CHAR> | ( [ "*", "?" ] ))* )
>>                    | ( [ "*", "?" ] <_TERM_START_CHAR> (<_TERM_CHAR> | ( [ "*", "?" ] ) )* ) >
>>
>> (Maybe this is not the best definition, but we have been using it for
>> 6 months and we haven't had any problems with it.)
>>
>> This definition will allow suffix queries. To keep them disabled by
>> default, I would suggest updating the Clause method in the following way:
>>
>> final public Query Clause(String field) throws ParseException {
>>  Query q;
>>  Token fieldToken=null, boost=null;
>>    if (jj_2_1(2)) {
>>      fieldToken = jj_consume_token(TERM);
>>      jj_consume_token(COLON);
>>      field=discardEscapeChar(fieldToken.image);
>>    } else {
>>      ;
>>    }
>>    switch ((jj_ntk==-1)?jj_ntk():jj_ntk) {
>>    case QUOTED:
>>    case TERM:
>>    case PREFIXTERM:
>>    case WILDTERM:
>>          // check the upcoming term token; fieldToken is the field name and may be null
>>          if (!allowSuffixQueries && isSuffixQuery(getToken(1).image))
>>                throw new ParseException("suffix queries are not allowed! You must set allowSuffixQueries to true!");
>>    case RANGEIN_START:
>>    case RANGEEX_START:
>>    case NUMBER:
>>      q = Term(field);
>>      break;
>>    case LPAREN:
>>      jj_consume_token(LPAREN);
>>      q = Query(field);
>>      jj_consume_token(RPAREN);
>>      switch ((jj_ntk==-1)?jj_ntk():jj_ntk) {
>>      case CARAT:
>>        jj_consume_token(CARAT);
>>        boost = jj_consume_token(NUMBER);
>>        break;
>>      default:
>>        jj_la1[5] = jj_gen;
>>        ;
>>      }
>>      break;
>>    default:
>>      jj_la1[6] = jj_gen;
>>      jj_consume_token(-1);
>>      throw new ParseException();
>>    }
>>      if (boost != null) {
>>        float f = (float)1.0;
>>  try {
>>    f = Float.valueOf(boost.image).floatValue();
>>          q.setBoost(f);
>>  } catch (Exception ignored) { }
>>      }
>>      {if (true) return q;}
>>    throw new Error("Missing return statement in function");
>>  }
>>
>>  boolean isSuffixQuery(String s){
>>       return s.startsWith("*") ||  s.startsWith("?");
>>  }
>>  I didn't find "*" and "?" defined as constants; the literals above
>> should be replaced by constants.
>>
>> As I am not very familiar with JavaCC, I don't know how to apply these
>> changes in the QueryParser.jj file.
>> Could you help me a little to apply these changes in the code, please?
>> I will create then some Junit tests to check the behaviour...
>>
>>
>>  Thanks in advance,
>>
>>   Sergiu
>
>
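
For readers who want to experiment with the on/off switch outside the
generated parser, here is a minimal standalone sketch of the flag and the
leading-wildcard check described in the quoted proposal. The class name
SuffixQueryGuard, the method check(), and the two constants are
hypothetical and are not part of Lucene; IllegalArgumentException stands
in for the parser's ParseException so that the example compiles on its
own.

```java
/**
 * Hypothetical helper (not part of Lucene): holds the proposed
 * allowSuffixQueries flag and rejects terms with a leading wildcard
 * while the flag is off.
 */
class SuffixQueryGuard {
    // Named constants for the wildcard literals, as the proposal suggests.
    static final char WILDCARD_STRING = '*';
    static final char WILDCARD_CHAR = '?';

    // Off by default, so existing behaviour is unchanged.
    private boolean allowSuffixQueries = false;

    public boolean getAllowSuffixQueries() {
        return allowSuffixQueries;
    }

    public void setAllowSuffixQueries(boolean allowSuffixQueries) {
        this.allowSuffixQueries = allowSuffixQueries;
    }

    /** Returns true if the term text starts with '*' or '?'. */
    public static boolean isSuffixQuery(String term) {
        if (term == null || term.isEmpty()) {
            return false;
        }
        char first = term.charAt(0);
        return first == WILDCARD_STRING || first == WILDCARD_CHAR;
    }

    /**
     * Throws if the term is a suffix query and suffix queries are disabled.
     * A real parser would throw its own ParseException here instead.
     */
    public void check(String term) {
        if (!allowSuffixQueries && isSuffixQuery(term)) {
            throw new IllegalArgumentException(
                "suffix queries are not allowed: set allowSuffixQueries to true");
        }
    }
}
```

With the flag at its default, check("*haus") throws and check("haus*")
passes; after setAllowSuffixQueries(true) both pass. Inside the parser the
same check would have to be applied to the text of the term token, not to
the field name.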




