From: "Chong, Herb"
To: "Lucene Users List"
Subject: RE: Dates and others
Date: Mon, 1 Dec 2003 13:18:49 -0500
Message-Id: <33D5BBBB077CAD47AA4F225359F4A5E40124086F@ny2528.corp.bloomberg.com>

ad hoc techniques run into lots of trouble because the
requirement on Lucene isn't well specified. if a document has one of the search terms but is a week newer, is that enough to move it ahead of a document that has all of the search terms? the boost mechanism is a way to move documents around in the ranking list, but it is clearly a way to reweight the importance of the query terms, not a way to impose external constraints that properly should be handled outside the search engine.

Herb...

-----Original Message-----
From: Doug Cutting [mailto:cutting@lucene.com]
Sent: Monday, December 01, 2003 1:11 PM
To: Lucene Users List
Subject: Re: Dates and others

The problem with this approach is that eventually you'll exhaust the range of the boost. So this will only work if you re-index things from scratch periodically, with a boost of something like 1/days-ago.

If you're adding documents to the index in date order, then you could use a HitCollector which adjusts scores according to the document number, since document numbers increase as you add to the index.

If you're not adding things in date order, then you can, when you open the index, build an array mapping document numbers to integer dates. Then your hit collector can use this to either boost or sort hits by date.

Or you could add a "month" or "week" field to documents, then add it as a clause to your queries with a boost. Then documents matching the most recent week(s) and/or month(s) would get the boost.

Doug
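
A minimal sketch of the array-mapping idea from the quoted message, combined with the 1/days-ago boost. It deliberately uses no Lucene classes, so it only illustrates the arithmetic that a hit collector might apply to each (document number, score) pair. The class and method names, the integer day numbers, and the blending formula score * (1 + weight * boost) are all assumptions made for illustration; neither message specifies them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical helper: re-ranks raw hits by recency outside the search engine. */
class DateRerank {

    /** A raw hit: a document number plus the engine's relevance score. */
    static final class Hit {
        final int doc;
        final float score;

        Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    /** Boost of 1/days-ago, clamped to at least one day to avoid dividing by zero. */
    static float recencyBoost(int docDay, int today) {
        int daysAgo = Math.max(1, today - docDay);
        return 1.0f / daysAgo;
    }

    /**
     * Adjusts each score using docDays, an array mapping document numbers to
     * integer dates (here: day numbers), then sorts by adjusted score, highest first.
     * The blend score * (1 + weight * boost) is an assumed formula.
     */
    static List<Hit> rerank(List<Hit> hits, int[] docDays, int today, float weight) {
        List<Hit> out = new ArrayList<>();
        for (Hit h : hits) {
            float boost = recencyBoost(docDays[h.doc], today);
            out.add(new Hit(h.doc, h.score * (1.0f + weight * boost)));
        }
        out.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return out;
    }

    public static void main(String[] args) {
        int[] docDays = {100, 130, 137};   // doc 0 is oldest, doc 2 is newest
        List<Hit> hits = Arrays.asList(
                new Hit(0, 0.9f), new Hit(1, 0.8f), new Hit(2, 0.5f));
        for (Hit h : rerank(hits, docDays, 138, 1.0f)) {
            System.out.println("doc " + h.doc + " adjusted score " + h.score);
        }
    }
}
```

With weight set to 1, the newest document (doc 2) overtakes the two documents that scored higher on relevance alone; with weight set to 0 the original order is kept. Choosing the weight is exactly the unspecified requirement raised above: the code cannot say how much a week of freshness should be worth against a better term match.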