lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Grant Ingersoll <gsing...@apache.org>
Subject Re: Clarity: Is there a Query boosting 50-50 over 1000-1 ?
Date Thu, 28 Aug 2008 20:33:53 GMT
Can you share your query generation code?  Your description doesn't  
make sense to me and I wonder how you are creating and running the  
searches.

Can you run the explain() method on your documents?

Also, FWIW, it sounds like you are prematurely optimizing.   For every  
query you ever do in your entire life (or until search engines are  
capable of reading your mind, and even then, I'm not sure) on any  
search system, you will be able to find at least one result that you  
think is better than one that is ranked higher.  Now, if you told me  
you had 50 queries that all scored badly, that is different.


On Aug 28, 2008, at 1:37 PM, Shi Hui Liu wrote:

> Hi Grant,
>
> Thank you for your help. My query is A AND B. The problem is if I  
> use BooleanQuery, I got score 120 from TermQuery(A) and 0.5 from  
> TermQuery(B) for the first article; for second article, I got score  
> 27 from TermQuery(A) and 36 from TermQuery(B). From my point of  
> view, I think the second article is better than the first one,  
> however, the result is otherwise. I'm thinking to create a Query to  
> multiply the score of A and B instead of sum. But I want to know if  
> there is an existing query doing same thing since I'm still a new  
> user to Lucene.
>
> Thank you in advance,
> Shi Hui
>
> -----Original Message-----
> From: Grant Ingersoll [mailto:gsingers@apache.org]
> Sent: Thursday, August 28, 2008 5:57 AM
> To: java-user@lucene.apache.org
> Subject: Re: Clarity: Is there a Query boosting 50-50 over 1000-1 ?
>
>
> On Aug 27, 2008, at 7:34 PM, Shi Hui Liu wrote:
>
>> Hi,
>>
>> I think I should clarify my question a little bit. I'm using
>> BooleanQuery to combine TermQuery(A) and TermQuery(B). But I'm not
>> satisfied with its scoring algorigthm. Is there other queries can
>> boost up the documents with 50 of A and 50 of B on top of documents
>> with 1000 of A and 1 of B?
>
> Is your query A + B meant to be A OR B or A AND B?  That is, are both
> terms required?  You notation suggests they are, but the description
> suggests you are getting documents that have only A in them, which
> suggests "OR".
>
> Have you looked at the explains?  What about the scoring aren't you
> happy with?  It's not perfect (there is no such thing) but it works
> pretty well in most cases, and works great if you spend a little time
> figuring out the right length normalization factors.
>
>
>> And I'm looking at the source code and found lots of classes are not
>> public and some important methods are protected. What's the reason?
>> Why make them public and let users to customize the Query easily?
>
> Because there not meant to be overridden, but of course we are open to
> specific suggestions on things that should be made public and often do
> this when someone shows a valid reason.
>
> Cheers,
> Grant
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>

--------------------------
Grant Ingersoll
http://www.lucidimagination.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ








---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message