lucene-solr-user mailing list archives

From Joe Doupnik <...@netlab1.net>
Subject Re: query bag of word with negation
Date Sun, 22 Apr 2018 18:30:24 GMT
On 22/04/2018 19:26, Joe Doupnik wrote:
> On 22/04/2018 19:04, Nicolas Paris wrote:
>> Hello
>>
>> I wonder if there is a plain text query syntax to say:
>> give me all document that match:
>>
>> wonderful pizza NOT peperoni
>>
>> all those in a 5 distance word bag
>> then
>>
>> pizza are wonderful -> would match
>> I made a wonderful pasta and pizza -> would match
>> Peperoni pizza are so wonderful -> would not match
>>
>> I tested:
>> "wonderful pizza - peperoni"~5
>> without success
>>
>> Thanks
>>
> ---------------
>     A partial answer to your question is contained in this Help screen 
> text from my Solr query program:
>
> Some hints about using this facility: 1. Query terms containing other 
> than just letters or digits may be placed within double quotes so that 
>  those other characters do not separate a term into many terms. A dot 
> (period) and white space are neither  letter nor digit. Examples: "Now 
> is the time for all good men" (spaces, quotes impose ordering too), 
> "goods.doc" (a dot). 2. Mode button "or" (the default) means match one 
> or more terms, perhaps scattered about. Mode button "and" means must 
> match all terms, scattered or not. 3. A one word query term may be 
> prefixed by title: or url: to search on those fields. A space must 
> follow the colon, and the search term is case sensitive. Examples: 
> url: .ppt or title: Goodies. Many docs do not have a formal internal 
> title field, thus prefix title: may not work. 4. Compound queries can 
> be built by joining terms with and or - and group items with ( ). Not 
> is expressed as a minus sign prefixing a term. A bare space means use 
> the Mode (or, and). Example: Nancy and Mary and -Jane and -(Robert 
> Daniel) which means both the first two and not Jane and neither of the 
> two guys. 5. A query of asterisk/star (*) means match everything. 
> Examples: * for everything (zero or more characters). Fussy, show all 
> without term .pdf * and -".pdf" For normal queries the program uses 
> the edismax interface. A few, such as url: foobar, reference the 
> Lucene interface. This is specified by the qagent= parameter, of 
> edismax or empty respectively, in a search request. Thus regular 
> facilities can do most of this work. What this example does not 
> address is your distance 5 criterion. However, the NOT facility may do 
> the trick for you, though a minus sign is taken as a literal minus 
> sign or word separator if located within a quoted string. Thanks, Joe D.
>
>
----------
     Golly, that was well and truly munged by the receiver. Let me try 
again -
>     A partial answer to your question is contained in this Help screen
> text from my Solr query program:
>
> Some hints about using this facility:
>
> 1. Query terms containing other than just letters or digits may be
>    placed within double quotes so that those other characters do not
>    separate a term into many terms. A dot (period) and white space are
>    neither letter nor digit. Examples: "Now is the time for all good
>    men" (spaces, quotes impose ordering too), "goods.doc" (a dot).
> 2. Mode button "or" (the default) means match one or more terms,
>    perhaps scattered about. Mode button "and" means must match all
>    terms, scattered or not.
> 3. A one word query term may be prefixed by title: or url: to search
>    on those fields. A space must follow the colon, and the search term
>    is case sensitive. Examples: url: .ppt or title: Goodies. Many docs
>    do not have a formal internal title field, thus prefix title: may
>    not work.
> 4. Compound queries can be built by joining terms with and or - and
>    group items with ( ). Not is expressed as a minus sign prefixing a
>    term. A bare space means use the Mode (or, and). Example: Nancy and
>    Mary and -Jane and -(Robert Daniel) which means both the first two
>    and not Jane and neither of the two guys.
> 5. A query of asterisk/star (*) means match everything. Examples: *
>    for everything (zero or more characters). Fussy, show all without
>    term .pdf: * and -".pdf"
>
> For normal queries the program uses the edismax interface. A few, such
> as url: foobar, reference the Lucene interface. This is specified by
> the qagent= parameter, of edismax or empty respectively, in a search
> request.
>
> Thus regular facilities can do most of this work. What this example
> does not address is your distance 5 criterion. However, the NOT
> facility may do the trick for you, though a minus sign is taken as a
> literal minus sign or word separator if located within a quoted string.
     Hopefully that will be more readable.
     Thanks,
     Joe D.
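
     As a rough illustration of the last point above (keep the minus sign
outside the quoted string), here is a minimal Python sketch that builds a
query combining a proximity phrase with a negated term. The helper name
build_select_url, the host, and the core name "mycore" are hypothetical and
only serve the example; the "phrase"~N slop syntax, the leading minus for
exclusion, and the defType=edismax request parameter are standard Solr
features. This is a sketch under those assumptions, not the query program
described in the message.

```python
from urllib.parse import urlencode


def build_select_url(base_url, required_terms, excluded_terms, slop):
    """Return (q, url) for a Solr /select request.

    The required terms go into one quoted phrase with a slop value, and
    each excluded term is prefixed with a minus sign OUTSIDE the quotes,
    since a minus inside a quoted string is not treated as NOT.
    """
    phrase = '"{}"~{}'.format(" ".join(required_terms), slop)
    negations = " ".join("-" + term for term in excluded_terms)
    q = (phrase + " " + negations).strip()
    params = {"q": q, "defType": "edismax"}
    return q, base_url + "/select?" + urlencode(params)


# Hypothetical host and core name, for illustration only.
q, url = build_select_url(
    "http://localhost:8983/solr/mycore",
    ["wonderful", "pizza"],
    ["peperoni"],
    5,
)
print(q)    # "wonderful pizza"~5 -peperoni
print(url)
```

     Note that the negated term here excludes any document containing it
anywhere in the searched field, not only within the 5-word window, so this
is only an approximation of the behaviour asked for. Slop also counts
position moves, so terms appearing in reversed order use up part of the
allowance.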
