From: petite_abeille
Subject: Re: Vector Space Model in Lucene?
Date: Fri, 14 Nov 2003 21:12:27 +0100
To: Lucene Users List

On Nov 14, 2003, at 20:54, Chong, Herb wrote:

> it solves one part of the problem, but there are a lot of sentences in
> a typical document. you'll need to composite a rank of a document from
> its constituent sentences then. there are less drastic ways to solve
> the problem. the other problem is that Lucene doesn't consider the
> term order in the query unless the query is formulated as a phrase. a
> simple bag-of-words query doesn't make use of the ordering of terms
> that apply in a given language.

This all sounds wonderfully exotic, but, from all the different esoteric approaches you ever tried, what, if anything, made a concrete and noticeable impact on the quality of your search?

PA.