You're not using any type of phrase search. Try -> ( (title:"John Bush"^4.0) OR (body:"John Bush") ) AND ( (title:John^4.0 body:John) AND (title:Bush^4.0 body:Bush) ) or maybe ( (title:"John Bush"~4^4.0) OR (body:"John Bush"~4) ) AND ( (title:John^4.0 body:John) AND (title:Bush^4.0 body:Bush) ) Tim Sturge wrote: > I'm following myself up here to ask if anyone has experience or code > with a BooleanQuery that weights the terms it encounters on a product > basis rather than a sum basis. > > This would effectively compute the geometric mean of the term score > (rather than the arithmetic mean) and would give me more "middle > bias". It also has the great advantage that it automatically > implements AND (as something without the term has a score of 0.0 which > causes the query to go to 0.0 as well.) > > I'm curious though why this doesn't already exist. Is it a bad idea in > general (that I will discover once I implement it and look at the > results?) or does it make searching a lot slower? > > Thanks, > > Tim > > Tim Sturge wrote: >> I have an index with two different sources of information, one small >> but of high quality (call it "title"), and one large, but of lower >> quality (call it "body"). I give boosts to certain documents related >> to their popularity (this is very similar to what one would do >> indexing the web). >> >> The problem I have is a query like "John Bush". I translate that into >> " (title:John^4.0 body:John) AND (title:Bush^4.0 body:Bush) ". But >> the results I get are: >> >> 1. George Bush >> ... >> 4. John Kerry >> ... >> 10. John Bush >> >> The reason is (looking at explain) that George Bush is scored: >> 169 = sum( >> 1 = >> ) >> 168 = sum( >> 160 = >> 8 = <body match for "Bush"> >> ) >> ) >> >> and John Kerry is similar but reversed. Poor old "John Bush" only >> scores: >> >> 72 = sum( >> 40 = (<title match for "John">+<body match>) >> 32 = (<title match for "Bush">+ <body match>) >> ) >> >> because his initial boost was only 1/4 of George's. >> >> The question I have is, how can tell the searcher to care about >> "balance"? I really want the score over 2 terms to be more like >> (sqrt(X)+sqrt(Y))^2 or maybe even exp(log(X)+log(Y)) rather than >> just X+Y. Is that supported in some obvious way, or is there some >> other way to phrase my query to say "I want both terms but they >> should both be important if possible?" >> >> Thanks, >> >> Tim >> >> >> >> >> >> >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org >> For additional commands, e-mail: java-user-help@lucene.apache.org >> > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org > For additional commands, e-mail: java-user-help@lucene.apache.org > --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org For additional commands, e-mail: java-user-help@lucene.apache.org