lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Chuck Williams" <ch...@manawiz.com>
Subject RE: QUERYPARSIN & BOOSTING
Date Wed, 12 Jan 2005 17:11:43 GMT
Google has natural results on the left and sponsored results on the
right.  I do not believe the natural results are affected by paid
keywords at all.  What you seem to be describing is the behavior of the
sponsored results, which I believe are explicitly attached to certain
keywords.

The same approach would work in Lucene.  Create a field to hold
"purchased" keywords (any keywords you want to associate with the
result).  Then you can include this field in your search with a high
boost (see DistributingMultiFieldQueryParser,
http://issues.apache.org/bugzilla/show_bug.cgi?id=32674).

Google prefers certain results over others for certain keywords based on
various factors of the keyword purchase and the site (amount paid for
the keyword, Page Rank of the site, tenure of the listing, popularity of
the listing, etc.).  You could emulate this in various ways, using a
combination of document/field boosting and perhaps replication of the
term in the field (to increase its tf), or even perhaps multiple fields
that are boosted at different levels.  I'm not sure of the best approach
to this part -- you could experiment a little.

Chuck

  > -----Original Message-----
  > From: Karthik N S [mailto:karthik@controlnet.co.in]
  > Sent: Wednesday, January 12, 2005 2:30 AM
  > To: Lucene Users List
  > Subject: RE: QUERYPARSIN & BOOSTING
  > 
  > Hi Guys
  > 
  > Apologies...........
  > 
  > If somebody's is  been closely watching GOOGLE, It boost's WEBSITES
for
  > payed category sites based on search words.
  > 
  > Can This [ boost the Full WEBSITE ] be achieved in Lucene's search
  > based on
  > searchword
  > 
  > If So Please Explain /examples  ???.
  > 
  > with regards
  > karthik
  > 
  > 
  > 
  > -----Original Message-----
  > From: Chuck Williams [mailto:chuck@manawiz.com]
  > Sent: Tuesday, January 11, 2005 2:00 PM
  > To: Lucene Users List; nsh@bayt.net
  > Subject: RE: QUERYPARSIN & BOOSTING
  > 
  > 
  > Karthik,
  > 
  > I don't think the boost in your example does much since you are
using an
  > AND query, i.e. all hits will have to contain both vendor:nike and
  > contents:shoes.  If you used an OR, then the boost would put nike
  > products above (non-nike) shoes, unless there was some other factor
that
  > causes score of contents:shoes to be 10x greater than that of
  > vendor:nike.  It's a good idea to look at the results of explain()
when
  > analyzing what's happening with scoring, tuning your boosts and your
  > Similarity.
  > 
  > Chuck
  > 
  >   > -----Original Message-----
  >   > From: Nader Henein [mailto:nsh@bayt.net]
  >   > Sent: Tuesday, January 11, 2005 12:21 AM
  >   > To: Lucene Users List
  >   > Subject: Re: QUERYPARSIN & BOOSTING
  >   >
  >   >  From the text on the Lucene Jakarta Site :
  >   > http://jakarta.apache.org/lucene/docs/queryparsersyntax.html
  >   >
  >   >
  >   > Lucene provides the relevance level of matching documents based
on
  > the
  >   > terms found. To boost a term use the caret, "^", symbol with a
boost
  >   > factor (a number) at the end of the term you are searching. The
  > higher
  >   > the boost factor, the more relevant the term will be.
  >   >
  >   >     Boosting allows you to control the relevance of a document
by
  >   >     boosting its term. For example, if you are searching for
  >   >
  >   >
  >   >
  >   >
  >   > jakarta apache
  >   >
  >   >
  >   >
  >   >
  >   >     and you want the term "jakarta" to be more relevant boost it
  > using
  >   >     the ^ symbol along with the boost factor next to the term.
You
  > would
  >   >     type:
  >   >
  >   >
  >   >
  >   >
  >   > jakarta^4 apache
  >   >
  >   >
  >   >
  >   >
  >   >     This will make documents with the term jakarta appear more
  > relevant.
  >   >     You can also boost Phrase Terms as in the example:
  >   >
  >   >
  >   >
  >   >
  >   > "jakarta apache"^4 "jakarta lucene"
  >   >
  >   >
  >   >
  >   >
  >   >     By default, the boost factor is 1. Although the boost factor
  > must be
  >   >     positive, it can be less than 1 (e.g. 0.2)
  >   >
  >   >
  >   > Regards.
  >   >
  >   > Nader Henein
  >   >
  >   >
  >   > Karthik N S wrote:
  >   >
  >   > >Hi Guys
  >   > >
  >   > >
  >   > >
  >   > >Apologies...........
  >   > >
  >   > >This Question may be asked million times on this form ,need
some
  >   > >clarifications.
  >   > >
  >   > >1) FieldType =  keyword      name =  vendor
  >   > >
  >   > >2)FieldType =  text              name = contents
  >   > >
  >   > >Question:
  >   > >
  >   > >1) How to Construct a Query which would allow hits  avaliable
for
  > the
  >   > VENDOR
  >   > >to  appear  first ?.
  >   > >
  >   > >2) If boosting is to be applied How TO   ?.
  >   > >
  >   > >3) Is the Query Constructed Below correct?.
  >   > >
  >   > >+Contents:shoes +((vendor:nike)^10)
  >   > >
  >   > >
  >   > >
  >   > >Please Advise.
  >   > >Thx in advance.
  >   > >
  >   > >
  >   > >WITH WARM REGARDS
  >   > >HAVE A NICE DAY
  >   > >[ N.S.KARTHIK]
  >   > >
  >   > >
  >   > >
  >   >
  >
>---------------------------------------------------------------------
  >   > >To unsubscribe, e-mail:
lucene-user-unsubscribe@jakarta.apache.org
  >   > >For additional commands, e-mail:
  > lucene-user-help@jakarta.apache.org
  >   > >
  >   > >
  >   > >
  >   > >
  >   > >
  >   > >
  >   >
  >   >
  >
---------------------------------------------------------------------
  >   > To unsubscribe, e-mail:
lucene-user-unsubscribe@jakarta.apache.org
  >   > For additional commands, e-mail:
lucene-user-help@jakarta.apache.org
  > 
  > 
  >
---------------------------------------------------------------------
  > To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
  > For additional commands, e-mail: lucene-user-help@jakarta.apache.org
  > 
  > 
  >
---------------------------------------------------------------------
  > To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
  > For additional commands, e-mail: lucene-user-help@jakarta.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message