lucene-solr-user mailing list archives

From Michael Sokolov <msoko...@safaribooksonline.com>
Subject Re: different fields for user-supplied phrases in edismax
Date Sat, 13 Dec 2014 19:48:06 GMT
I mentioned the ticket I opened earlier in the thread; it's 
https://issues.apache.org/jira/browse/SOLR-6842.

My thought was to provide a new parameter, pqf, which would be just like 
qf, but for phrases.  If it's not present, phrases work just like before 
(they search the fields in qf).  I wasn't proposing to change the behaviour of the 
phrase-like queries generated by pf.
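
To make the fallback concrete, here is a minimal sketch in plain Java of how the field list could be chosen per clause. It is not the SOLR-6842 patch and does not use any Solr classes; `PhraseFieldChooser`, `fieldsFor`, and the field names used in the usage note are hypothetical. It assumes `pqf` is given as a whitespace-separated field list in the same style as `qf`, and leaves any boost suffix such as `title^2` unparsed.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Solr: picks the field list for one clause.
class PhraseFieldChooser {

    // Returns the pqf fields for a user-quoted phrase when pqf is set and
    // non-blank; otherwise falls back to the qf fields, as for bare terms.
    static List<String> fieldsFor(Map<String, String> params, boolean quotedPhrase) {
        String qf = params.getOrDefault("qf", "").trim();
        String pqf = params.getOrDefault("pqf", "").trim();

        String chosen = (quotedPhrase && !pqf.isEmpty()) ? pqf : qf;
        if (chosen.isEmpty()) {
            return Collections.emptyList();
        }
        // Entries are kept as written; boosts like "title^2" are not parsed here.
        return Arrays.asList(chosen.split("\\s+"));
    }
}
```

With `qf=title body` and `pqf=title_exact body_exact`, a quoted phrase would be searched in the two `_exact` fields and a bare term in `title` and `body`; removing `pqf` sends both to the `qf` fields, which is the "just like before" case.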

-Mike

On 12/13/14 11:02 AM, Jack Krupansky wrote:
> Sounds worthy of a Jira.
>
> One gotcha - how to support both use cases of more precision for 
> quoted phrases as well as the same precision as unquoted. The latter 
> is needed for edismax phrase boosting, although it might be 
> interesting to support both, so more precise phrases get an even 
> higher boost than less-precise phrases. But it does need to be 
> optional since it has an added cost at query time.
>
> -- Jack Krupansky
>
> -----Original Message----- From: Michael Sokolov
> Sent: Saturday, December 13, 2014 8:43 AM
> To: solr-user@lucene.apache.org
> Subject: Re: different fields for user-supplied phrases in edismax
>
> I want terms to be stemmed, unless they are quoted, using dismax.
>
>
> On 12/12/14 8:19 PM, Amit Jha wrote:
>> Hi Mike,
>>
>> What exactly is your use case?
>> What do you mean by "controlling the fields used for phrase queries"?
>>
>>
>> Rgds
>> AJ
>>
>>> On 12-Dec-2014, at 20:11, Michael Sokolov 
>>> <msokolov@safaribooksonline.com> wrote:
>>>
>>> Doug - I believe pf controls the fields that are used for the phrase 
>>> queries *generated by the parser*.
>>>
>>> What I am after is controlling the fields used for the phrase 
>>> queries *supplied by the user* -- ie surrounded by double-quotes.
>>>
>>> -Mike
>>>
>>>> On 12/12/2014 08:53 AM, Doug Turnbull wrote:
>>>> Michael,
>>>>
>>>> I typically solve this problem by using a copyField and running different
>>>> analysis on the destination field. Then you could use this field as pf
>>>> instead of qf. If I recall, fields in pf must also be mentioned in qf for
>>>> this to work.
>>>>
>>>> -Doug
>>>>
>>>> On Fri, Dec 12, 2014 at 8:13 AM, Michael Sokolov <
>>>> msokolov@safaribooksonline.com> wrote:
>>>>> Yes, I guess it's a common expectation that searches work this way.  It
>>>>> was actually almost trivial to add as an extension to the edismax parser,
>>>>> and I have what I need now; I opened SOLR-6842; if there's interest I'll
>>>>> try to find the time to contribute back to Solr
>>>>>
>>>>> -Mike
>>>>>
>>>>>
>>>>>> On 12/11/14 5:20 PM, Ahmet Arslan wrote:
>>>>>>
>>>>>> Hi Mike,
>>>>>>
>>>>>> If I am not wrong, you are trying to simulate Google's behaviour.
>>>>>> If you use quotes, Google returns exact matches. I think that makes
>>>>>> perfect sense and will be a valuable addition. I remember some folks
>>>>>> asked/requested this behaviour on the list.
>>>>>>
>>>>>> Ahmet
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Thursday, December 11, 2014 10:50 PM, Michael Sokolov <
>>>>>> msokolov@safaribooksonline.com> wrote:
>>>>>> I'd like to supply a different set of fields for phrases than for bare
>>>>>> terms.  Specifically, we'd like to treat phrases as more "exact" -
>>>>>> probably turning off stemming and generally having a tighter analysis
>>>>>> chain.  Note: this is *not* what's done by configuring "pf" which
>>>>>> controls fields for the auto-generated phrases.  What we want to do is
>>>>>> provide our users more precise control by explicit use of " "
>>>>>>
>>>>>> Is there a way to do this by configuring edismax?  I don't think there
>>>>>> is, and then if you agree, a followup question - if I want to extend the
>>>>>> EDismax parser, does anybody have advice as to the best way in? I'm
>>>>>> looking at:
>>>>>>
>>>>>> Query getFieldQuery(String field, String val, int slop)
>>>>>>
>>>>>> and altering getAliasedQuery() to accept an aliases parameter, which
>>>>>> would be a different set of aliases for phrases ...
>>>>>>
>>>>>> does that make sense?
>>>>>>
>>>>>> -Mike 
>
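
The distinction the thread turns on, phrases the user typed in double quotes versus bare terms, can be illustrated with a small standalone sketch. This is not how the edismax parser tokenizes queries (it has its own grammar and handles operators, field prefixes, and escapes); `QuotedPhraseSplitter` and its methods are hypothetical names, and the regex is a deliberately naive assumption that only understands whitespace and balanced double quotes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; the real parser does not work this way.
class QuotedPhraseSplitter {

    // Group 1: contents of a balanced double-quoted phrase.
    // Group 2: a bare run of non-whitespace characters.
    private static final Pattern CLAUSE = Pattern.compile("\"([^\"]*)\"|(\\S+)");

    // Phrases the user supplied in double quotes (candidates for "exact" fields).
    static List<String> quotedPhrases(String userQuery) {
        return collect(userQuery, 1);
    }

    // Everything else (candidates for the stemmed fields in qf).
    // A token with an unbalanced quote falls through to this list.
    static List<String> bareTerms(String userQuery) {
        return collect(userQuery, 2);
    }

    private static List<String> collect(String userQuery, int group) {
        List<String> out = new ArrayList<>();
        Matcher m = CLAUSE.matcher(userQuery);
        while (m.find()) {
            String s = m.group(group);
            if (s != null && !s.isEmpty()) {
                out.add(s);
            }
        }
        return out;
    }
}
```

For the input `stemming "exact match" solr`, `quotedPhrases` yields the single phrase `exact match` and `bareTerms` yields `stemming` and `solr`. In a real extension the same decision would presumably be made inside the parser, where the quoted clause arrives through the `getFieldQuery` call mentioned in the original message.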

