lucene-solr-user mailing list archives

From "Jack Krupansky" <j...@basetechnology.com>
Subject Re: Synonyms and hyphens
Date Wed, 04 Jul 2012 17:37:12 GMT
There is one other detail that should clarify the situation. At query time, 
the query parser itself breaks your query into space-delimited terms and 
calls the analyzer separately for each of those terms, each of which is 
treated as if it were a quoted phrase. So it doesn't matter whether it is the 
standard analyzer, the word delimiter filter, or some other filter that 
breaks up the compound term.

And the default "query operator" only applies to the "terms" as the query 
parser parsed them, not to the sub-terms of a compound term like CD-ROM or 
gb-mb.
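
As a rough illustration of the point above, here is a toy Python model of
"split on whitespace first, then analyze each chunk on its own". It is a
sketch under assumptions, not Solr's code: analyze() and parse() are
hypothetical helpers, the synonym table is chosen to match the parsedquery
output quoted further down this thread, and the second query is assumed to
be the space-separated form "gb mb".

```python
# Toy model only, not Solr's implementation. The synonym table is an
# assumption chosen to match the parsedquery output quoted in this thread.
SYNONYMS = {
    "gb": ["gb", "gib", "gigabyte", "gigabytes"],
    "mb": ["mb", "mib", "megabyte", "megabytes"],
}


def analyze(chunk):
    """Hypothetical analyzer: split on hyphens, then expand synonyms.

    Returns one list of tokens per token position.
    """
    return [SYNONYMS.get(tok, [tok]) for tok in chunk.lower().split("-") if tok]


def parse(query, field="name"):
    """Split on whitespace first, then analyze each chunk separately (q.op=AND)."""
    clauses = []
    for chunk in query.split():
        positions = analyze(chunk)
        if len(positions) > 1:
            # Compound chunk such as "gb-mb": the operator is never applied
            # between its sub-terms, so every token ends up required.
            for tokens in positions:
                clauses.extend(f"+{field}:{t}" for t in tokens)
        elif positions:
            tokens = positions[0]
            if len(tokens) == 1:
                clauses.append(f"+{field}:{tokens[0]}")
            else:
                # Single position: synonyms are alternatives in one required clause.
                alternatives = " ".join(f"{field}:{t}" for t in tokens)
                clauses.append(f"+({alternatives})")
    return " ".join(clauses)


print(parse("gb-mb"))
print(parse("gb mb"))
```

In this model the hyphenated query yields eight separately required
clauses, while the space-separated query yields two required groups of
alternatives, which is the same shape as the two parsedquery strings
reported in the original message.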

-- Jack Krupansky

-----Original Message----- 
From: Alireza Salimi
Sent: Wednesday, July 04, 2012 12:05 PM
To: solr-user@lucene.apache.org
Subject: Re: Synonyms and hyphens

Wow, I didn't know that. Is there a way to disable this feature? I mean, is
it something coming from the Analyzer?

On Wed, Jul 4, 2012 at 12:26 PM, Jack Krupansky 
<jack@basetechnology.com> wrote:

> Terms with embedded special characters are treated as phrases with spaces
> in place of the special characters. So, "gb-mb" is treated as if you had
> enclosed the term in quotes.
>
> -- Jack Krupansky
> -----Original Message-----
> From: Alireza Salimi
> Sent: Wednesday, July 04, 2012 6:50 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Synonyms and hyphens
>
>
> Hi,
>
> Does anybody know why the hyphen '-' and q.op=AND cause such a big difference
> between the two queries? I thought hyphens were removed by StandardTokenizer,
> which means the two queries should, in theory, be the same!
>
> Thanks
>
> On Tue, Jul 3, 2012 at 4:05 PM, Alireza Salimi <alireza.salimi@gmail.com>
> wrote:
>
>> Hi,
>>
>> I'm not sure if anybody has experienced this behavior before or not.
>> I noticed that 'hyphen' plays a very important role here.
>> I used Solr's default example directory.
>>
>> http://localhost:8983/solr/select/?q=name:(gb-mb)&version=2.2&start=0&rows=10&indent=on&debugQuery=on&indent=on&wt=json&q.op=AND
>> results in "parsedquery":"+name:gb +name:gib +name:gigabyte
>> +name:gigabytes +name:mb +name:mib +name:megabyte +name:megabytes",
>>
>> While searching
>> http://localhost:8984/solr/select/?q=name:(gbmb)&version=2.2&start=0&rows=10&indent=on&debugQuery=on&indent=on&wt=json&q.op=AND
>> results in "parsedquery":"+(name:gb name:gib name:gigabyte
>> name:gigabytes) +(name:mb name:mib name:megabyte name:megabytes)",
>>
>> If you look at the first query (the one with the hyphen), you can see that
>> the parsing result is totally different. I know that hyphens are special
>> characters in Solr, but there's no way the first query can return any entry,
>> because it's asking for ALL of the synonyms.
>>
>> Am I missing something here?
>>
>> Thanks
>>
>>
>> --
>> Alireza Salimi
>> Java EE Developer
>>
>>
>>
>>
>
> --
> Alireza Salimi
> Java EE Developer
>



-- 
Alireza Salimi
Java EE Developer 

