lucene-java-user mailing list archives

From "Jack Krupansky"
Subject Re: how to remove the dash
Date Tue, 26 Jun 2012 03:14:15 GMT
Most query parsers will "parse" a leading hyphen as an operator, so it will 
never get to the analyzer for any field. Whether white space is permitted 
between the "-" operator and the following term is dependent on the specific 
query parser, and not guaranteed.

So, "bebidas - agua" is parsed by the query parser the same as 
"bebidas -agua", where "-" is the "prohibit" operator. This is all as it 
should be.

Generally, all operators, including "+", "-", parentheses, "AND", "OR", etc., 
need to be escaped if you want them to be passed through to the field 
analyzers. Operators embedded within terms do not need to be escaped, except 
for parentheses.
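
As a rough illustration of the escaping described above, here is a minimal, self-contained sketch that puts a backslash in front of each character the classic query syntax treats as special. The class name QueryEscaper and the exact character list are assumptions made for this example. Lucene's QueryParser already provides a static escape(String) method for this purpose, and that is probably the better choice in real code. Note that this sketch does nothing about keyword operators such as "AND" and "OR".

```java
// Hypothetical helper for illustration only; not part of Lucene.
final class QueryEscaper {
    // Assumed list of characters with special meaning in the classic query syntax.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&/";

    /** Returns the input with a backslash in front of every special character. */
    static String escape(String input) {
        StringBuilder out = new StringBuilder(input.length() + 8);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                out.append('\\'); // escape so the parser treats it as plain text
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The hyphen is escaped, so it is no longer read as the prohibit operator.
        System.out.println(escape("bebidas - agua")); // prints: bebidas \- agua
    }
}
```

This escapes operators embedded within terms as well, which is more than strictly needed but should be harmless.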

So, if you want user input to be treated as raw English text, as opposed to 
a "structured" query, be sure to filter or escape the user query text before 
parsing it. Or, consider using a simple term query that does no query 
"parsing", but does pass the term through the field analyzer for the desired 
field type.
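
For the "filter" option, one possible sketch is to replace every query-syntax character with a space before the text reaches the parser. The class name QueryTextFilter, the method name stripOperators, and the character list are all assumptions for this example, not anything from Lucene. It is deliberately blunt: it also splits terms with embedded operators (for example "e-mail" becomes "e mail") and leaves keyword operators such as "AND" and "OR" untouched.

```java
// Hypothetical helper for illustration only; not part of Lucene.
final class QueryTextFilter {
    // Character class covering the assumed set of query-syntax characters.
    private static final String OPERATOR_CHARS = "[\\\\+\\-!(){}\\[\\]^\"~*?:|&/]";

    /** Replaces operator characters with spaces, then collapses runs of whitespace. */
    static String stripOperators(String input) {
        String withoutOperators = input.replaceAll(OPERATOR_CHARS, " ");
        return withoutOperators.trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        // The stand-alone hyphen disappears, leaving two ordinary terms.
        System.out.println(stripOperators("bebidas - agua")); // prints: bebidas agua
    }
}
```

The filtered string can then be handed to the query parser, which will see only plain terms to pass on to the field analyzer.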

-- Jack Krupansky

-----Original Message----- 
Sent: Monday, June 25, 2012 4:12 PM
Subject: Re: how to remove the dash

More information...
If I change
System.out.println("Query: " + query.toString("contents"));
to this:
System.out.println("Query: " + query.toString());
I get this result:
"Query: contents:bebidas -contents:agua"

As I have already tried many different analyzers and always get the same
result, maybe the problem is in the query parser?

On Monday, 25 June 2012, 21:10:02
> You are right... I'm not getting the hyphen inside any token... but it is
> being used as the "prohibit" operator.
> This is my output:
> Test: bebidas - agua
> Query: bebidas -agua
> Tokens:
> 1: [bebidas:0->7:<ALPHANUM>]
> 2: [agua:10->14:<ALPHANUM>]
> Test is the original string.
> Thanks
