lucene-java-user mailing list archives

From Victor Hadianto <vict...@nuix.com.au>
Subject Re: HELP in QueyParsing !!
Date Mon, 14 Jul 2003 08:32:23 GMT
> Input:   QueryCreated     Remarks
> c\+\+      c           (Escape character not working)

The StandardTokenizer and QueryParser will drop the ++ characters. This problem is 
similar to the one in a recent thread. Search the archive for the following subject:
'-' character not interpreted correctly in field names

You may be able to implement a solution similar to the one that I posted there. 

Actually your query got me interested. I tried my solution with c-- and the 
-- signs are dropped. This is because I define DASHESWORD as 

| <DASHESWORD: <ALPHANUM> ("-" <ALPHANUM>)+ >

This will search for t-shirt, but not tshirt-. Yet another QueryParser 
peculiarity :)
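
To make the behaviour of that token definition concrete, here is a rough Java 
sketch that mimics it with a regular expression. It is only an approximation: 
ALPHANUM is assumed to be ASCII letters and digits, which is narrower than what 
StandardTokenizer.jj accepts, and the class and method names are made up for 
illustration.

```java
import java.util.regex.Pattern;

public class DashesWordSketch {
    // Assumption: ALPHANUM approximated as one or more ASCII letters or digits.
    private static final String ALPHANUM = "[A-Za-z0-9]+";

    // Mirrors <DASHESWORD: <ALPHANUM> ("-" <ALPHANUM>)+ >
    private static final Pattern DASHESWORD =
            Pattern.compile(ALPHANUM + "(?:-" + ALPHANUM + ")+");

    /** True if the whole text would form a single DASHESWORD token. */
    public static boolean isDashesWord(String text) {
        return DASHESWORD.matcher(text).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"t-shirt", "tshirt-", "c--"}) {
            System.out.println(s + " -> " + isDashesWord(s));
        }
    }
}
```

Only t-shirt is accepted. Every dash has to be followed by an ALPHANUM, so the 
trailing dashes in tshirt- and c-- fall outside the token.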

If you absolutely have to search for c++ then I suggest you define another 
token which encompasses alphanumeric words and the plus sign. For example 
(modify StandardTokenizer.jj):

<MYTOKEN: (<ALPHANUM>|"+")+ >

Then add the line:

token = <MYTOKEN>

in the next() method. This may work.
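
As a quick way to see what such a token would accept, the following Java sketch 
imitates it with a regular expression. The same caveat applies as for any 
regex stand-in: ALPHANUM is assumed to be ASCII letters and digits only, and 
PlusTokenSketch and isMyToken are hypothetical names, not part of Lucene.

```java
import java.util.regex.Pattern;

public class PlusTokenSketch {
    // (<ALPHANUM>|"+")+ with ALPHANUM assumed to be [A-Za-z0-9]+ is the same
    // language as one or more characters drawn from letters, digits and '+'.
    private static final Pattern MYTOKEN = Pattern.compile("[A-Za-z0-9+]+");

    /** True if the whole text would form a single MYTOKEN token. */
    public static boolean isMyToken(String text) {
        return MYTOKEN.matcher(text).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"c++", "c", "++", "c--"}) {
            System.out.println(s + " -> " + isMyToken(s));
        }
    }
}
```

Note that c++ and plain c both match, but so does a bare ++, so the token is 
broader than just "a word followed by plus signs". c-- is still rejected.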

> c++        -           (Parser throws an exception) [NOTE-1]
As expected.

> *c         -           (throws an exception -   [NOTE-2]
There have been a number of discussions on this subject; search the mailing list 
for more information. 

> Does that mean that the program should taken care of validating the
> User input and then pass the query string to QueryParser?

It depends on how you look at it. QueryParser will throw a ParseException if it has 
parsing issues, so you can treat that as the validation.
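
The idea of treating the exception as validation could be sketched roughly as 
below. To keep the example compilable without Lucene on the classpath, a 
hypothetical ThrowingParser interface and a toy parser stand in for the real 
call. In an application the lambda would wrap the actual QueryParser call and 
the exception caught would be ParseException.

```java
public class QueryValidationSketch {
    /** Hypothetical stand-in for a call into QueryParser; not a Lucene type. */
    public interface ThrowingParser {
        Object parse(String queryText) throws Exception;
    }

    /** Returns null if the text parsed, otherwise a message describing the failure. */
    public static String validate(String queryText, ThrowingParser parser) {
        try {
            parser.parse(queryText);
            return null;
        } catch (Exception e) {
            // Fall back to toString() when the exception carries no message.
            return e.getMessage() == null ? e.toString() : e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Toy parser for the demo only: rejects a leading '*', like the "*c" case.
        ThrowingParser toy = q -> {
            if (q.startsWith("*")) {
                throw new Exception("leading wildcard not allowed");
            }
            return q;
        };
        System.out.println("c  -> " + validate("c", toy));
        System.out.println("*c -> " + validate("*c", toy));
    }
}
```

The caller can then show the returned message to the user instead of letting 
the exception propagate.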


HTH,
victor

