lucene-java-user mailing list archives

From "Jack Krupansky" <j...@basetechnology.com>
Subject Re: how to remove the dash
Date Tue, 26 Jun 2012 03:14:15 GMT
Most query parsers will "parse" a leading hyphen as an operator, so it will 
never get to the analyzer for any field. Whether white space is permitted 
between the "-" operator and the following term is dependent on the specific 
query parser, and not guaranteed.

So, "bebidas - agua" is parsed by the query parser the same as 
"bebidas -agua", which is the "prohibit" operator. This is all as it should 
be.

Generally, all operators, including "+", "-", parentheses, "AND", "OR", etc., 
need to be escaped if you want them to be passed through to the field 
analyzers. Operators embedded within terms do not need to be escaped, except 
for parentheses.
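
As a rough illustration of the escaping described above, here is a minimal plain-Java sketch. The class name QueryEscaper and the method escapeOperators are hypothetical (they are not part of Lucene), and the list of special characters is an assumption modeled on the classic query parser syntax. It escapes every special character wherever it appears, which is more conservative than strictly necessary, and it does not handle the keyword operators "AND", "OR", and "NOT". In real code, the static QueryParser.escape(String) method of the classic query parser is probably the better choice.

```java
public class QueryEscaper {
    // Assumed set of special characters, modeled on the classic Lucene query syntax.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&/";

    /** Prefixes every special character in the input with a backslash. */
    public static String escapeOperators(String input) {
        StringBuilder sb = new StringBuilder(input.length() + 8);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints: bebidas \- agua
        System.out.println(escapeOperators("bebidas - agua"));
    }
}
```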

So, if you want user input to be treated as raw English text, as opposed to 
a "structured" query, be sure to filter or escape the user query text before 
parsing it. Or, consider using a simple term query that does no query 
"parsing", but does pass the term through the field analyzer for the desired 
field type.
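
For the "filter" option, a minimal sketch could look like the following. PlainTextFilter and toPlainTerms are hypothetical names, not Lucene API. The sketch replaces everything that is not a letter, digit, or white space with a space, and lower-cases the keyword operators on the assumption that the query parser only recognizes them in upper case. Note the trade-off: hyphens inside words are removed too, so a term like "wi-fi" becomes two words.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class PlainTextFilter {
    /** Reduces raw user input to plain words separated by single spaces. */
    public static String toPlainTerms(String input) {
        // Replace anything that is not a letter, digit, or white space with a space.
        String cleaned = input.replaceAll("[^\\p{L}\\p{N}\\s]", " ");
        List<String> kept = new ArrayList<>();
        for (String word : cleaned.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // happens only when the input had no words at all
            }
            // Assumption: keyword operators are only recognized in upper case,
            // so lower-casing them turns them into ordinary terms.
            if (word.equals("AND") || word.equals("OR") || word.equals("NOT")) {
                word = word.toLowerCase(Locale.ROOT);
            }
            kept.add(word);
        }
        return String.join(" ", kept);
    }

    public static void main(String[] args) {
        // Prints: bebidas agua
        System.out.println(toPlainTerms("bebidas - agua"));
    }
}
```

The filtered string can then be handed to the query parser, which no longer sees a "-" to interpret as the "prohibit" operator.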

-- Jack Krupansky

-----Original Message----- 
From: listas@alphamatrix.org
Sent: Monday, June 25, 2012 4:12 PM
To: java-user@lucene.apache.org
Subject: Re: how to remove the dash

More information...
If I change
System.out.println("Query: " + query.toString("contents"));
to this:
System.out.println("Query: " + query.toString());
I get this result:
"Query: contents:bebidas -contents:agua"

I have already tried many different Analyzers and always get the same
result, so maybe the problem is in the query parser?


On Monday, 25 June 2012 21:10:02, listas@alphamatrix.org wrote:
> You are right... I am not getting the hyphen inside any token... but it is
> still used as the "prohibit" operator.
>
> This is my output:
> Test: bebidas - agua
> Query: bebidas -agua
> Tokens:
> 1: [bebidas:0->7:<ALPHANUM>]
> 2: [agua:10->14:<ALPHANUM>]
>
> Test is the original string.
> Thanks

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org 