lucene-solr-user mailing list archives

From Chuming Chen <chumingc...@gmail.com>
Subject Re: Solr NGram Phrase Query
Date Thu, 16 Nov 2017 15:40:01 GMT
Hi Shawn,

I think the position is the issue, but how do I fix it? Is something wrong with my index analyzer,
or is my query just not right? I need to do a phrase query, so order is important here.

I tried "KKS KSA"~1 in the query and it worked. However, "KKS KSA SAR"~1 didn't
work; I had to use "KKS KSA SAR"~2.
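
The slop values reported above are consistent with a simplified model of sloppy phrase matching, sketched here in Python. This is only an illustration under stated assumptions, not Lucene's actual implementation: it assumes every indexed n-gram term sits at position 1 (as described in the quoted reply below), it ignores repeated terms, and the function names required_slop and phrase_matches are hypothetical.

```python
def required_slop(query_positions, index_positions):
    """Smallest slop that lets the phrase match under a simplified model.

    Each query term is paired with the index position of its matching term.
    The slop needed is the spread of (index position - query position).
    """
    shifts = [p - q for q, p in zip(query_positions, index_positions)]
    return max(shifts) - min(shifts)


def phrase_matches(query_positions, index_positions, slop=0):
    """True if the phrase would match with the given slop (simplified model)."""
    return required_slop(query_positions, index_positions) <= slop


# Assumption: all indexed terms share position 1.
two_terms = required_slop([1, 2], [1, 1])          # "KKS KSA"
three_terms = required_slop([1, 2, 3], [1, 1, 1])  # "KKS KSA SAR"
print(two_terms, three_terms)  # prints: 1 2
```

Under this model, each additional term in the phrase adds one to the slop required when all indexed terms share a position. If the indexed terms were at consecutive positions instead, required_slop would return 0 and no slop would be needed.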

Is phrase slop essential here? When I used Solr 3.5, no phrase slop was needed.

Thanks,

Chuming



On Nov 16, 2017, at 10:13 AM, Shawn Heisey <apache@elyograg.org> wrote:

> On 11/16/2017 7:38 AM, Chuming Chen wrote:
>> Referencing the first image in the message, showing the analysis tab.  This reply
>> is plain text, so that image cannot be included.
> 
> In your query, you have two terms as a phrase - kks and ksa.  These match terms in the
> index, but the reason that the *query* doesn't match is that the relative *positions* don't
> match.  In your query, the terms are at position 1 and 2, but in the *index*, all the terms
> are at position 1.  Because the query has quotes, it is a phrase query, which means that positions
> matter.  With the query terms at position 1 and position 2, the indexed terms would have to
> be at say position 5 and position 6 -- next to each other and in that specific order -- in
> order to have a match.
> 
> If you sent "KKS KSA"~1 instead, the query would have a phrase slop of 1, which would
> mean that the relative positions can differ by one and still match.  Or if you were to remove
> the quotes so that it were not a phrase query, it might match.  All of this of course depends
> on what your default field is.
> 
> Also, note that the query analysis page does not know that quotes are special -- the
> query parser is not used.  Running the analysis with the quotes happens to work out correctly
> in this particular case because the standard tokenizer in this field type removes punctuation,
> but on a less aggressive analysis chain, the quotes that are counted as special by the query
> parser (and therefore are not even sent to analysis) might actually be included in the terms
> on the analysis page.
> 
> Thanks,
> Shawn
> 

