lucene-solr-user mailing list archives

From "Allison, Timothy B." <talli...@mitre.org>
Subject RE: negation search help
Date Wed, 23 Nov 2016 21:46:48 GMT
You've gotten far better answers on this already, but you can also use the SpanNotQuery in the SpanQueryParser
I maintain and have published to Maven Central [1][2][3].

This does not carry out any NLP, but it would let you write the literal query "headache (no not)"!~5,0, which
matches "headache" unless "no" or "not" shows up within the 5 words before it.

[1] https://github.com/tballison/lucene-addons/tree/master/lucene-5205
[2] https://github.com/tballison/lucene-addons/tree/master/solr-5410 
[3] http://search.maven.org/#artifactdetails%7Corg.tallison.lucene%7Clucene-addons%7C6.3-0.1%7Cpom
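
To make the "within 5 words before" rule concrete, here is a minimal Python sketch of the matching behaviour described above. It does not use Lucene and is not the SpanQueryParser's code; the tokenizer, the function name unnegated_positions, and the sample sentences are assumptions made purely for illustration.

```python
import re

# Negation cues taken from the query example above: "no" and "not".
NEGATION_CUES = {"no", "not"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def unnegated_positions(text, term, cues=NEGATION_CUES, pre=5):
    """Return token positions of `term` that have no cue in the `pre` tokens before them."""
    tokens = tokenize(text)
    hits = []
    for i, tok in enumerate(tokens):
        if tok != term:
            continue
        # Look only at the window of up to `pre` tokens preceding the match.
        window = tokens[max(0, i - pre):i]
        if not any(w in cues for w in window):
            hits.append(i)
    return hits


print(unnegated_positions("Patient reports severe headache", "headache"))  # [3]
print(unnegated_positions("Patient has no headache today", "headache"))    # []
```

A cue that appears more than 5 tokens before the term does not suppress the match, which is the main edge case to check against real documents.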



-----Original Message-----
From: Alexandre Rafalovitch [mailto:arafalov@gmail.com] 
Sent: Wednesday, November 23, 2016 10:03 AM
To: solr-user <solr-user@lucene.apache.org>
Subject: Re: negation search help

Well, then 'no' becomes a signal token. So the question is: how many of the tokens that follow it
fall within its circle of negation?

You could probably use something like the Surround query parser
(https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-SurroundQueryParser)
and, if the user searched for 'headache', say:
-{!surround} 3w(not, headache)

But I am not sure how this would work in terms of multi-term queries.

Alternatively, you could transform your input with a custom token filter that, after seeing
the term 'no' or 'not', just eats that term and the next n tokens, for some n you would have to choose.
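
As a rough illustration of that token-eating idea, here is a small Python sketch of the logic only. In Solr this would have to be a custom Java TokenFilter wired into the field's analysis chain; the function name drop_negated, the default window of n=2, and the whitespace tokenization are all assumptions, not something taken from Solr itself.

```python
def drop_negated(tokens, cues=("no", "not"), n=2):
    """Drop each negation cue and the next `n` tokens after it.

    `n` is an assumed window size; the right value depends on your documents.
    """
    kept = []
    skip = 0  # how many upcoming tokens are still inside a negation window
    for tok in tokens:
        if tok.lower() in cues:
            skip = n  # a new cue (re)starts the window and is itself dropped
            continue
        if skip > 0:
            skip -= 1
            continue
        kept.append(tok)
    return kept


print(drop_negated("Fortunately no concurrent trauma was found".split()))
# ['Fortunately', 'was', 'found']
print(drop_negated("In no apparent distress".split()))
# ['In']
```

With the negated tokens removed before indexing, a search for "trauma" or "distress" would simply find nothing in those sentences. The cost is that a fixed n will sometimes eat too much or too little.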

Or you could run the sentences through natural language recognition and remove or mark the noun phrases
that are negated.

What I am trying to say is that Solr can do a bunch of different things for you. But you first
need to translate your domain problem into a much lower level pseudo-language problem that
addresses your needs, including the edge cases, which none of us can guess from your description.
Then you can implement it in Solr.

Hope this helps,
   Alex.

----
http://www.solr-start.com/ - Resources for Solr users, new and experienced


On 24 November 2016 at 01:43, Hem Naidu
<hem.naidu@teschglobal.com.invalid> wrote:
> Correct Alex. The use case is that when a provider searches patient medical information for
> certain symptoms, mentions like "no headache", "no blood loss", "not diabetic" should
> not show up in the search results.
>
> Thanks
>
>
> -----Original Message-----
> From: Alessandro Benedetti [mailto:benedetti.alex85@gmail.com]
> Sent: Wednesday, November 23, 2016 8:22 AM
> To: solr-user@lucene.apache.org
> Subject: Re: negation search help
>
> Now that I have read it more carefully: do you mean that those negations must be recognized
> at indexing time, so that they do not match?
>
> Cheers
>
> On Wed, Nov 23, 2016 at 2:20 PM, Alessandro Benedetti <benedetti.alex85@gmail.com> wrote:
>
>> Hi Hem,
>> are you expecting Solr to parse your natural language query out of 
>> the box ?
>> Are you using any custom query parser ?
>>
>> If not, you need to follow the Lucene syntax to define negative queries.
>>
>> And be careful with the edge cases [1].
>>
>> Cheers
>>
>> [1] https://wiki.apache.org/solr/NegativeQueryProblems
>>
>> On Wed, Nov 23, 2016 at 1:54 PM, Hem Naidu <hem.naidu@teschglobal.com.invalid> wrote:
>>
>>> Alex
>>>
>>> Whenever keywords or a sentence are preceded by "no", "not", etc., they
>>> should be excluded from the search results. Does Solr support this feature?
>>>
>>> Thanks
>>>
>>>
>>> Sent from my iPhone
>>>
>>>
>>> > On Nov 23, 2016, at 12:09 AM, Alexandre Rafalovitch <arafalov@gmail.com> wrote:
>>> >
>>> > How do you _know_ it is not 'apparent' ? Is it because it is 
>>> > preceded by the keyword 'no'? Just that keyword? At what maximum distance?
>>> >
>>> > Regards,
>>> >   Alex
>>> >
>>> > On 23 Nov 2016 2:59 PM, "Hem Naidu" <hem.naidu@teschglobal.com.invalid> wrote:
>>> >
>>> >> Gurus,
>>> >>
>>> >> I am new to Solr. I have a requirement to index entire PDF/Word documents
>>> >> using Solr and Tika, which was successful, and I am able to get the search
>>> >> results displayed. Now I need to fine-tune the results or adjust the index so
>>> >> that negative statements are filtered out of the results. For example, my
>>> >> input text for the index from the documents would be
>>> >> -----------------------------------
>>> >> Fortunately no concurrent trauma was found In no apparent 
>>> >> distress
>>> >> --------------------------------------
>>> >>
>>> >> If a user searches for "concurrent trauma" or "distress", the search engine
>>> >> should filter out these results, as they are not apparent symptoms.
>>> >>
>>> >> Any help on whether Solr can do this?
>>> >> If so, do I need to adjust the index or build custom queries?
>>> >>
>>> >> Any help on this would be greatly appreciated !
>>> >>
>>> >> Thanks
>>> >>
>>> >>
>>> >>
>>>
>>
>>
>>
>> --
>> --------------------------
>>
>> Benedetti Alessandro
>> Visiting card - http://about.me/alessandro_benedetti
>> Blog - http://alexbenedetti.blogspot.co.uk
>>
>> "Tyger, tyger burning bright
>> In the forests of the night,
>> What immortal hand or eye
>> Could frame thy fearful symmetry?"
>>
>> William Blake - Songs of Experience -1794 England
>>