Return-Path: Delivered-To: apmail-lucene-java-user-archive@www.apache.org Received: (qmail 86022 invoked from network); 15 Jun 2007 20:14:37 -0000 Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.2) by minotaur.apache.org with SMTP; 15 Jun 2007 20:14:37 -0000 Received: (qmail 70746 invoked by uid 500); 15 Jun 2007 20:14:33 -0000 Delivered-To: apmail-lucene-java-user-archive@lucene.apache.org Received: (qmail 70696 invoked by uid 500); 15 Jun 2007 20:14:32 -0000 Mailing-List: contact java-user-help@lucene.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: java-user@lucene.apache.org Delivered-To: mailing list java-user@lucene.apache.org Received: (qmail 70679 invoked by uid 99); 15 Jun 2007 20:14:32 -0000 Received: from herse.apache.org (HELO herse.apache.org) (140.211.11.133) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 15 Jun 2007 13:14:32 -0700 X-ASF-Spam-Status: No, hits=0.0 required=10.0 tests= X-Spam-Check-By: apache.org Received-SPF: neutral (herse.apache.org: local policy) Received: from [212.27.42.28] (HELO smtp2-g19.free.fr) (212.27.42.28) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 15 Jun 2007 13:14:27 -0700 Received: from [192.168.1.102] (dau94-4-82-227-122-98.fbx.proxad.net [82.227.122.98]) by smtp2-g19.free.fr (Postfix) with ESMTP id EFE2598FA8 for ; Fri, 15 Jun 2007 22:14:04 +0200 (CEST) Mime-Version: 1.0 (Apple Message framework v752.2) In-Reply-To: References: Content-Type: text/plain; charset=ISO-8859-1; delsp=yes; format=flowed Message-Id: <0C836030-35BF-4928-BA70-B90E399EC4F7@garambrogne.net> Content-Transfer-Encoding: quoted-printable X-Image-Url: http://homepage.mac.com/rdupond/.cv/thumbs/me.thumbnail From: Mathieu Lecarme Subject: Re: Several questions about scoring/sorting + random sorting in an image/related application Date: Fri, 15 Jun 2007 22:14:03 +0200 To: java-user@lucene.apache.org X-Mailer: Apple Mail (2.752.2) X-Virus-Checked: Checked by ClamAV on apache.org Walt explain differently what I said. Lucene can be efficiently use for selecting objects, without sorting =20 or scoring anything, then, with id stored in Lucene, you can sort =20 yourself with a simple Sortable implementation. The only limit is that lucene gives you not too much results, with =20 your 300 maximal responses, you can play with it easily. M. Le 15 juin 07 =E0 19:07, Walt Stoneburner a =E9crit : > Antoine Baudoux writes: >> I want to be able to give a score to each collection. > > Keep in mind, Lucene is computing a score based on quite a number of > things from how often a term is used in a document, how often it > appears in the collection of documents, how long the query is, etc. > > If your concept of a document's score changes, then I'd be inclined to > think you're possibly using Lucene in a manner it wasn't designed for. > That said, I have two thoughts. > > THOUGHT ONE > Use Lucene to locate "records" for you --- what you really are > interested in getting back _from Lucene_ is the primary key. Then, > use this key to do a lookup in your database of the score of the day > and sort accordingly. The idea is that Lucene finds, your table > scores, and because of that you won't need to re-index when something > changes. > > THOUGHT TWO > Use boosting. COLLECTION_ONE^5 COLLECTION_THREE^10 etc. That way > /if/ the Lucene document appears in the collection, it's score is > weighted according to your preferences. You're free to change the > boosts on a query-by-query basis without having to re-index. > > >> I can use a Very big ... query ... I am afraid that it will be slow. > Try it. I think you'll find Lucene is _fast_. We do some pretty HUGE > and complicated queries and Lucene just screams. > > >> I can add another field to each document, containing a computed >> custom score, then i could sort on that field. But i want to avoid >> this solution at all costs, since it would mean re-indexing all the >> documents each time the collection scores change. > Or, use indirection - instead of keeping the score, keep the primary > key of a score table. Then in a database, where speed won't be the > issue, perform the look up. Honestly, if you're only got 300 > categories, you could keep that simple table in memory using less > space than a small text file. > > > >> I would also like to implement random-sorting. ... Is it a good =20 >> solution? >> Is there another way to do it? > > This really, really, really feels like you're force fitting Lucene to > do some business logic piece of a larger application. May I be so > bold as to ask what's the _actual_ problem you're trying to solve. > ("I'm trying to make a hole in a piece of oak" as opposed to "What's > the best way to sharpen a Phillips screwdriver enough to cut wood?") > > Keep in mind that the forum is for Lucene, so parts of your questions > may be answered outside of the forum. > > -wls > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org > For additional commands, e-mail: java-user-help@lucene.apache.org > > --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org For additional commands, e-mail: java-user-help@lucene.apache.org