lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ching-Pei Hsing <cphs...@comergent.com>
Subject RE: Need help in changing the search score
Date Tue, 11 Mar 2003 19:13:40 GMT
Thanks Doug,

I've tried the query you suggested, it improved the situation but still
didn't get back the preferred result. What I would like to understand is
what is the difference between the query I had (but not supported in1.2) and
the query you suggested in terms of the score calculation. I can go ahead
and tried out the pre1.3. Still I would really appreciate some hints from
you for my educational purpose.

I understand if we build an extra field which contains information from all
three fields in the example and search against the new field, we should be
able to get the preferred result. So the decision for us is whether we
should add this additional field and take some hit in resource consumption?
or there's a better and systematic way of sort this out? We were hoping to
sort this out by playing with the booster(or even tuning the score
calculation if the risk is small).

Some background information about the way we use Lucene. The application is
a commerce catalog which includes the user front end and the admin tools.
The catalog information is very structured. It is kept in a RDB and spread
into several relational entities/tables. We extracted these information and
build them into about 14 fields for each product periodically. The reason we
need to separate them into that many field is that in our advanced search we
allow the user to construct complex query by the fields. In another type of
search though, we just takes the search criteria from the user and search
against all the information; currently that means against all those 14
fields. Depending on the information from the customers, the number of docs
in the index can go as high as several millions. The speed of indexing and
the size of the index does matter for us.

Thanks again

Ching-pei

-----Original Message-----
From: Doug Cutting [mailto:cutting@lucene.com]
Sent: Tuesday, March 11, 2003 9:36 AM
To: Lucene Users List
Subject: Re: Need help in changing the search score


Ching-Pei Hsing wrote:
> Even if we boost the Name by 10 like the following query, It's still the
> same. 
> 
> query = (NAME:inn NAME:comfort NAME:shampoo)^10 (MMNUM:inn MMNUM:shampoo
> MMNUM:comfort) (SMNUM:shampoo SMNUM:comfort SMNUM:inn)

In the 1.2 release, I don't think this sort of boosting (of a complex 
clause) has any affect.  To do what you want in 1.2 you need to instead 
formulate this as follows, boosting individual terms only:

(NAME:inn^10 NAME:comfort^10 NAME:shampoo^10) (MMNUM:inn MMNUM:shampoo 
MMNUM:comfort) (SMNUM:shampoo SMNUM:comfort SMNUM:inn)

This is fixed in the pre-1.3 sources.  In 1.3 there will also be a query 
explanation feature and extensible scoring, both of which could be 
useful to you.  But I suspect that just boosting the individual terms, 
as above, will meet your needs now.

Doug


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message