From: "Chuck Williams"
To: "Lucene Users List"
Subject: RE: QUERYPARSIN & BOOSTING
Date: Wed, 12 Jan 2005 09:11:43 -0800

Google has natural results on the left and sponsored results on the
right. I do not believe the natural results are affected by paid
keywords at all.
What you seem to be describing is the behavior of the sponsored results,
which I believe are explicitly attached to certain keywords. The same
approach would work in Lucene. Create a field to hold "purchased"
keywords (any keywords you want to associate with the result). Then you
can include this field in your search with a high boost (see
DistributingMultiFieldQueryParser,
http://issues.apache.org/bugzilla/show_bug.cgi?id=32674).

Google prefers certain results over others for certain keywords based on
various factors of the keyword purchase and the site (amount paid for
the keyword, PageRank of the site, tenure of the listing, popularity of
the listing, etc.). You could emulate this in various ways, using a
combination of document/field boosting and perhaps replication of the
term in the field (to increase its tf), or even multiple fields that are
boosted at different levels. I'm not sure of the best approach to this
part; you could experiment a little.

Chuck

> -----Original Message-----
> From: Karthik N S [mailto:karthik@controlnet.co.in]
> Sent: Wednesday, January 12, 2005 2:30 AM
> To: Lucene Users List
> Subject: RE: QUERYPARSIN & BOOSTING
>
> Hi Guys
>
> Apologies...
>
> If anybody has been closely watching Google: it boosts websites for
> paid-category sites based on search words.
>
> Can this (boosting the full website) be achieved in Lucene's search,
> based on the search word?
>
> If so, please explain with examples.
>
> With regards
> Karthik
>
> -----Original Message-----
> From: Chuck Williams [mailto:chuck@manawiz.com]
> Sent: Tuesday, January 11, 2005 2:00 PM
> To: Lucene Users List; nsh@bayt.net
> Subject: RE: QUERYPARSIN & BOOSTING
>
> Karthik,
>
> I don't think the boost in your example does much since you are using an
> AND query, i.e. all hits will have to contain both vendor:nike and
> contents:shoes.
> If you used an OR, then the boost would put nike products above
> (non-nike) shoes, unless there was some other factor that causes the
> score of contents:shoes to be 10x greater than that of vendor:nike.
> It's a good idea to look at the results of explain() when analyzing
> what's happening with scoring, tuning your boosts and your Similarity.
>
> Chuck
>
> > -----Original Message-----
> > From: Nader Henein [mailto:nsh@bayt.net]
> > Sent: Tuesday, January 11, 2005 12:21 AM
> > To: Lucene Users List
> > Subject: Re: QUERYPARSIN & BOOSTING
> >
> > From the text on the Lucene Jakarta site:
> > http://jakarta.apache.org/lucene/docs/queryparsersyntax.html
> >
> >   Lucene provides the relevance level of matching documents based on
> >   the terms found. To boost a term use the caret, "^", symbol with a
> >   boost factor (a number) at the end of the term you are searching.
> >   The higher the boost factor, the more relevant the term will be.
> >
> >   Boosting allows you to control the relevance of a document by
> >   boosting its term. For example, if you are searching for
> >
> >       jakarta apache
> >
> >   and you want the term "jakarta" to be more relevant boost it using
> >   the ^ symbol along with the boost factor next to the term. You
> >   would type:
> >
> >       jakarta^4 apache
> >
> >   This will make documents with the term jakarta appear more
> >   relevant. You can also boost Phrase Terms as in the example:
> >
> >       "jakarta apache"^4 "jakarta lucene"
> >
> >   By default, the boost factor is 1. Although the boost factor must
> >   be positive, it can be less than 1 (e.g. 0.2)
> >
> > Regards.
> >
> > Nader Henein
> >
> > Karthik N S wrote:
> >
> > >Hi Guys
> > >
> > >Apologies...
> > >
> > >This question may have been asked a million times on this forum;
> > >I need some clarifications.
> > >
> > >1) FieldType = keyword, name = vendor
> > >
> > >2) FieldType = text, name = contents
> > >
> > >Question:
> > >
> > >1) How do I construct a query that makes hits for the VENDOR appear
> > >first?
> > >
> > >2) If boosting is to be applied, how?
> > >
> > >3) Is the query constructed below correct?
> > >
> > >+Contents:shoes +((vendor:nike)^10)
> > >
> > >Please advise.
> > >Thanks in advance.
> > >
> > >With warm regards
> > >Have a nice day
> > >[ N.S.KARTHIK]

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org
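
To make the AND-versus-OR point from this thread concrete, here is a
minimal, self-contained Java sketch. It deliberately does not use
Lucene: the class BoostSketch, its three sample documents, and the
scoring rule (each matching clause adds 1.0, and the vendor clause is
multiplied by its boost) are all assumptions made for illustration.
Real Lucene scores also depend on tf, idf, and normalization, so treat
this only as a picture of why a boost on a required clause does little
to reorder hits, while the same boost on an optional clause can.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class BoostSketch {
    /** Toy document (hypothetical): a vendor keyword plus the tokens of its contents field. */
    static final class Doc {
        final String name;
        final String vendor;
        final Set<String> contents;

        Doc(String name, String vendor, String... tokens) {
            this.name = name;
            this.vendor = vendor;
            this.contents = new HashSet<>(Arrays.asList(tokens));
        }
    }

    // Sample data invented for this sketch.
    static final List<Doc> DOCS = Arrays.asList(
        new Doc("adidas-shoes", "adidas", "running", "shoes"),
        new Doc("nike-shoes", "nike", "running", "shoes"),
        new Doc("nike-shirt", "nike", "cotton", "shirt"));

    /**
     * Ranks DOCS for the two clauses contents:term and vendor:vendor^boost.
     * requireBoth = true models "+contents:term +vendor:vendor^boost" (AND);
     * requireBoth = false models "contents:term vendor:vendor^boost" (OR).
     * Toy scoring, NOT Lucene's formula: a matching contents clause adds 1.0,
     * a matching vendor clause adds 1.0 * boost.
     */
    public static List<String> search(String term, String vendor, double boost, boolean requireBoth) {
        List<Doc> hits = new ArrayList<>();
        final Map<Doc, Double> score = new HashMap<>();
        for (Doc d : DOCS) {
            boolean termMatch = d.contents.contains(term);
            boolean vendorMatch = d.vendor.equals(vendor);
            boolean isHit = requireBoth ? (termMatch && vendorMatch) : (termMatch || vendorMatch);
            if (!isHit) {
                continue;
            }
            score.put(d, (termMatch ? 1.0 : 0.0) + (vendorMatch ? boost : 0.0));
            hits.add(d);
        }
        // Highest score first; List.sort is stable, so ties keep their original order.
        hits.sort((a, b) -> Double.compare(score.get(b), score.get(a)));
        List<String> names = new ArrayList<>();
        for (Doc d : hits) {
            names.add(d.name);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("AND: " + search("shoes", "nike", 10.0, true));
        System.out.println("OR:  " + search("shoes", "nike", 10.0, false));
    }
}
```

Under these toy assumptions the AND query returns only nike-shoes, so
the boost has nothing to reorder, while the OR query ranks nike-shoes
first, then nike-shirt, then adidas-shoes. On a real index, explain()
remains the way to see what actually drives each score.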