lucene-java-user mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: ORs and Ranks
Date Thu, 15 Jan 2009 22:55:24 GMT

:   The question I'm trying to phrase is: Is there a way to make the rank of
: SHOULD term conditional?
: 
:   In the example, I'm trying to express "If the term MEDICAL is found, the
: term CAT ranks high; if the term ANIMAL is found, the term CAT ranks low."

except that there is an ambiguous situation here: what if a document 
contains both MEDICAL and ANIMAL?

you'll probably want a query something like this...

   (+MEDICAL -ANIMAL CAT^10) (+ANIMAL -MEDICAL CAT^0.1) (-ANIMAL -MEDICAL CAT)

:   According to Luke, I get two SHOULD clauses, each with a MUST and a
: SHOULD.   As I understood things, a SHOULD *term* merely affects the ranking
: of the results, it doesn't affect what gets brought back.  So I'm trying to
: understand what a SHOULD *clause* does in this case.  More importantly, what
: does it logically mean to: "should have a must?"   That's like saying I have
: an optional mandatory term.

not exactly ... Lucene queries "build up" result sets (hence you can't 
have a purely negative query).  When a boolean query doesn't contain any 
MUST clauses, at least one SHOULD clause must match a document for 
that document to make it into the result set.

So when your outermost BooleanQuery contains two SHOULD clauses that means 
you need one or the other to match -- if both match, your score gets even 
higher.
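
To make the matching rule concrete, here is a minimal sketch in plain Java. It is a toy model, not Lucene's implementation: the names ShouldClauseSketch, BoolQuery, Clause and Term are made up for illustration, a document is assumed to be just a set of terms, and scoring is ignored. The example query has the shape described in the quoted question (two SHOULD clauses, each holding a MUST and a SHOULD).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ShouldClauseSketch {
    enum Occur { MUST, SHOULD, MUST_NOT }

    /** Anything that can say whether a document (modeled as a set of terms) matches. */
    interface Query {
        boolean matches(Set<String> doc);
    }

    /** A single term: matches when the document contains it. */
    static final class Term implements Query {
        final String text;
        Term(String text) { this.text = text; }
        public boolean matches(Set<String> doc) { return doc.contains(text); }
    }

    /** One sub-query plus its occurrence flag. */
    static final class Clause {
        final Occur occur;
        final Query query;
        Clause(Occur occur, Query query) { this.occur = occur; this.query = query; }
    }

    /** Toy boolean query: matching only, no scoring. */
    static final class BoolQuery implements Query {
        final List<Clause> clauses;
        BoolQuery(Clause... clauses) { this.clauses = Arrays.asList(clauses); }

        public boolean matches(Set<String> doc) {
            boolean hasMust = false;
            boolean shouldHit = false;
            for (Clause c : clauses) {
                boolean hit = c.query.matches(doc);
                switch (c.occur) {
                    case MUST:
                        hasMust = true;
                        if (!hit) return false;   // a required clause failed
                        break;
                    case MUST_NOT:
                        if (hit) return false;    // a prohibited clause matched
                        break;
                    default:                      // SHOULD
                        shouldHit |= hit;
                }
            }
            // With no MUST clause, at least one SHOULD clause has to match.
            // A purely negative query therefore matches nothing.
            return hasMust || shouldHit;
        }
    }

    static Set<String> doc(String... terms) {
        return new HashSet<>(Arrays.asList(terms));
    }

    /** (+MEDICAL CAT) (+ANIMAL CAT): two SHOULD clauses, each with a MUST and a SHOULD. */
    static Query example() {
        Query medical = new BoolQuery(
                new Clause(Occur.MUST, new Term("MEDICAL")),
                new Clause(Occur.SHOULD, new Term("CAT")));
        Query animal = new BoolQuery(
                new Clause(Occur.MUST, new Term("ANIMAL")),
                new Clause(Occur.SHOULD, new Term("CAT")));
        return new BoolQuery(
                new Clause(Occur.SHOULD, medical),
                new Clause(Occur.SHOULD, animal));
    }

    public static void main(String[] args) {
        Query q = example();
        System.out.println(q.matches(doc("MEDICAL")));        // true: first clause matches
        System.out.println(q.matches(doc("ANIMAL", "CAT")));  // true: second clause matches
        System.out.println(q.matches(doc("CAT")));            // false: neither MUST is satisfied
    }
}
```

In this model the inner MUST is only "mandatory" within its own clause; the outer query treats each clause as optional but still needs one of them to match.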

:   Is it even possible to express this construct as a single expression or
: data structure for the API:
:     1.   +( MEDICAL ANIMAL )    You must have either MEDICAL and/or ANIMAL.
:     2.   If MEDICAL present, then CAT ranks high, else, if ANIMAL present,
: then CAT ranks low, otherwise the presence of the term CAT has no influence
: on rank.

...ah, see when you elaborate on the details, it becomes easier to spell 
out the query structure...

   (+MEDICAL CAT^10) (+ANIMAL -MEDICAL CAT^0.1) 

in order for one of the main clauses to match, either MEDICAL or ANIMAL 
must match.  if MEDICAL matches, CAT scores high; we only care about ANIMAL 
matching if MEDICAL doesn't match -- in which case CAT ranks low.
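
As a sanity check, the behaviour of that query can be written out as a small truth table in plain Java. This is only a sketch under simplifying assumptions: a document is a set of upper-case terms, the class and method names (ConditionalBoostSketch, inResults, catBoost) are hypothetical, and the returned number is just the boost attached to CAT in whichever clause matched, not a real Lucene score.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class ConditionalBoostSketch {
    /** True when the first clause, (+MEDICAL CAT^10), matches. */
    static boolean medicalClause(Set<String> doc) {
        return doc.contains("MEDICAL");
    }

    /** True when the second clause, (+ANIMAL -MEDICAL CAT^0.1), matches. */
    static boolean animalClause(Set<String> doc) {
        return doc.contains("ANIMAL") && !doc.contains("MEDICAL");
    }

    /** The outer query has only SHOULD clauses, so one of them must match. */
    static boolean inResults(Set<String> doc) {
        return medicalClause(doc) || animalClause(doc);
    }

    /** Boost attached to CAT for this document, or 0 when CAT has no influence. */
    static float catBoost(Set<String> doc) {
        if (!doc.contains("CAT")) return 0f;
        if (medicalClause(doc)) return 10f;
        if (animalClause(doc)) return 0.1f;
        return 0f; // document is not in the result set at all
    }

    static Set<String> doc(String... terms) {
        return new HashSet<>(Arrays.asList(terms));
    }

    public static void main(String[] args) {
        String[][] docs = {
            {"MEDICAL", "CAT"},
            {"ANIMAL", "CAT"},
            {"MEDICAL", "ANIMAL", "CAT"},
            {"MEDICAL"},
            {"CAT"},
        };
        for (String[] terms : docs) {
            Set<String> d = doc(terms);
            System.out.println(Arrays.toString(terms)
                    + " inResults=" + inResults(d)
                    + " catBoost=" + catBoost(d));
        }
    }
}
```

Note that a document containing both MEDICAL and ANIMAL falls under the first clause only, so CAT gets the high boost there, and a document containing just CAT is not returned at all.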




-Hoss



