lucene-solr-user mailing list archives

From Alireza Salimi <alireza.sal...@gmail.com>
Subject Re: Synonyms and hyphens
Date Wed, 04 Jul 2012 17:56:23 GMT
OK, so how can I prevent this behavior?
As you can see, the parsed query is very different in the two cases.
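
For anyone reproducing the comparison, here is a minimal sketch that rebuilds the debug request from the original message quoted below. The base URL and parameters are copied from that message; the helper name `debug_url` is hypothetical, and nothing here contacts a Solr server.

```python
from urllib.parse import urlencode, parse_qs

# Host, port and path copied from the first URL in the quoted message.
BASE = "http://localhost:8983/solr/select/"


def debug_url(q, op="AND"):
    """Return a select URL with debugQuery=on (hypothetical helper)."""
    params = [
        ("q", q),
        ("version", "2.2"),
        ("start", "0"),
        ("rows", "10"),
        ("indent", "on"),
        ("debugQuery", "on"),  # makes Solr include "parsedquery" in the response
        ("wt", "json"),
        ("q.op", op),          # default operator for the parsed terms
    ]
    return BASE + "?" + urlencode(params)


if __name__ == "__main__":
    for query in ("name:(gb-mb)", "name:(gbmb)"):
        print(debug_url(query))
```

Opening both printed URLs and comparing the "parsedquery" entries in the debug section shows the difference discussed in this thread.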

On Wed, Jul 4, 2012 at 1:37 PM, Jack Krupansky <jack@basetechnology.com> wrote:

> There is one other detail that should clarify the situation. At query
> time, the query parser itself is breaking your query into space-delimited
> terms, and only calling the analyzer for each of those terms, each of which
> will be treated as if it were a quoted phrase. So it doesn't matter whether it is
> the standard analyzer or word delimiter filter or other filter that is
> breaking up the compound term.
>
> And the default "query operator" only applies to the "terms" as the query
> parser parsed them, not for the sub-terms of a compound term like CD-ROM or
> gb-mb.
>
>
> -- Jack Krupansky
>
> -----Original Message----- From: Alireza Salimi
> Sent: Wednesday, July 04, 2012 12:05 PM
>
> To: solr-user@lucene.apache.org
> Subject: Re: Synonyms and hyphens
>
> Wow, I didn't know that. Is there a way to disable this feature? I mean, is
> it something coming from the Analyzer?
>
> On Wed, Jul 4, 2012 at 12:26 PM, Jack Krupansky <jack@basetechnology.com> wrote:
>
>> Terms with embedded special characters are treated as phrases with spaces
>> in place of the special characters. So, "gb-mb" is treated as if you had
>> enclosed the term in quotes.
>>
>> -- Jack Krupansky
>> -----Original Message----- From: Alireza Salimi
>> Sent: Wednesday, July 04, 2012 6:50 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Synonyms and hyphens
>>
>>
>> Hi,
>>
>> Does anybody know why the hyphen '-' combined with q.op=AND causes such a
>> big difference between the two queries? I thought hyphens were removed by
>> StandardTokenizer, which means that in theory the two queries should be
>> the same!
>>
>> Thanks
>>
>> On Tue, Jul 3, 2012 at 4:05 PM, Alireza Salimi <alireza.salimi@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I'm not sure if anybody has experienced this behavior before or not.
>>> I noticed that 'hyphen' plays a very important role here.
>>> I used Solr's default example directory.
>>>
>>> http://localhost:8983/solr/select/?q=name:(gb-mb)&version=2.2&start=0&rows=10&indent=on&debugQuery=on&indent=on&wt=json&q.op=AND
>>>
>>> results in "parsedquery":"+name:gb +name:gib +name:gigabyte
>>> +name:gigabytes +name:mb +name:mib +name:megabyte +name:megabytes",
>>>
>>> While searching
>>> http://localhost:8984/solr/select/?q=name:(gbmb)&version=2.2&start=0&rows=10&indent=on&debugQuery=on&indent=on&wt=json&q.op=AND
>>>
>>> results in "parsedquery":"+(name:gb name:gib name:gigabyte
>>> name:gigabytes) +(name:mb name:mib name:megabyte name:megabytes)",
>>>
>>> If you look at the first query (the one with the hyphen), you can see that
>>> the parsed result is totally different. I know that hyphens are special
>>> characters in Solr, but there's no way the first query can return any
>>> entry, because it requires ALL of the synonyms to match.
>>>
>>> Am I missing something here?
>>>
>>> Thanks
>>>
>>>
>>> --
>>> Alireza Salimi
>>> Java EE Developer
>>>
>>>
>>>
>>>
>>>
>> --
>> Alireza Salimi
>> Java EE Developer
>>
>>
>
>
> --
> Alireza Salimi
> Java EE Developer
>



-- 
Alireza Salimi
Java EE Developer
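
The behavior described in the quoted replies can be illustrated with a toy model. This is a sketch only, not Solr's or Lucene's actual code: `SYNONYMS`, `analyze` and `parse` are hypothetical names, the synonym lists are taken from the parsed queries in the thread, and the flattening of a compound term into individually required clauses is modeled on the reported "parsedquery" output rather than on the Solr source. It also assumes the second query reaches the parser as two whitespace-separated terms (`gb mb`), which the URL in the thread does not show.

```python
import re

# Synonym groups as they appear in the parsed queries quoted in the thread.
SYNONYMS = {
    "gb": ["gb", "gib", "gigabyte", "gigabytes"],
    "mb": ["mb", "mib", "megabyte", "megabytes"],
}


def analyze(term):
    """Toy analyzer: split one term on non-alphanumerics, expand synonyms.

    Returns a list of positions; each position is a list of tokens.
    """
    positions = []
    for piece in re.split(r"[^0-9A-Za-z]+", term.lower()):
        if piece:
            positions.append(SYNONYMS.get(piece, [piece]))
    return positions


def parse(field, text, default_op="AND"):
    """Toy query parser producing a parsedquery-like string."""
    prefix = "+" if default_op == "AND" else ""
    clauses = []
    for term in text.split():      # the parser splits on whitespace first
        positions = analyze(term)  # the analyzer only sees one term at a time
        tokens = [tok for pos in positions for tok in pos]
        if len(positions) > 1:
            # Compound term such as gb-mb: every token becomes its own
            # clause, so with q.op=AND all synonyms are required.
            clauses.extend(f"{prefix}{field}:{tok}" for tok in tokens)
        elif len(tokens) > 1:
            # One position with synonyms: a group of optional alternatives.
            group = " ".join(f"{field}:{tok}" for tok in tokens)
            clauses.append(f"{prefix}({group})")
        elif tokens:
            clauses.append(f"{prefix}{field}:{tokens[0]}")
    return " ".join(clauses)


if __name__ == "__main__":
    print(parse("name", "gb-mb"))
    print(parse("name", "gb mb"))
```

Under these assumptions the first call yields the flat list of required clauses reported for the hyphenated query, and the second yields the two required synonym groups. The default operator is applied only to the terms produced by the whitespace split, which is the point made in the quoted explanation.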
