Date: Sat, 25 Jun 2011 17:55:47 +0000 (UTC)
From: "David Mark Nemeskey (JIRA)"
To: dev@lucene.apache.org
Message-ID: <878981629.40634.1309024547454.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <647665016.20394.1308566927699.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Updated] (LUCENE-3220) Implement various ranking models as Similarities

     [ https://issues.apache.org/jira/browse/LUCENE-3220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Mark Nemeskey updated LUCENE-3220:
----------------------------------------

    Attachment: LUCENE-3220.patch

Implementation of the DFR framework added. Lots of nocommits, though.

Some things to think about:
 * Lots of (float) conversions. Maybe the inner API (BasicModel, etc.) could use doubles? In my experience, double is faster anyway, at least on 64-bit architectures.
 * I am not overly happy about the naming scheme, i.e. BasicModelBE, etc. Maybe we should do it the same way as in Terrier, with a basicmodel package and class names like BE?
 * A regular SimilarityProvider implementation won't play well with DFRSimilarity if the user wants to use several different setups. Actually, this is a problem for all similarities that have parameters (e.g. BM25 with b and k).

Also, I think we need the NormConverter we talked about earlier on IRC, so that the Similarities can run on any index.

> Implement various ranking models as Similarities
> ------------------------------------------------
>
>                 Key: LUCENE-3220
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3220
>             Project: Lucene - Java
>          Issue Type: Sub-task
>          Components: core/search
>    Affects Versions: flexscoring branch
>            Reporter: David Mark Nemeskey
>            Assignee: David Mark Nemeskey
>              Labels: gsoc
>         Attachments: LUCENE-3220.patch, LUCENE-3220.patch, LUCENE-3220.patch, LUCENE-3220.patch, LUCENE-3220.patch, LUCENE-3220.patch
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> With [LUCENE-3174|https://issues.apache.org/jira/browse/LUCENE-3174] done, we can finally work on implementing the standard ranking models. Currently DFR, BM25 and LM are on the menu.
> TODO:
>  * {{EasyStats}}: contains all statistics that might be relevant for a ranking algorithm
>  * {{EasySimilarity}}: the ancestor of all the other similarities. Hides the DocScorers and as much implementation detail as possible
>  * _BM25_: the current "mock" implementation might be OK
>  * _LM_
>  * _DFR_
> Done:

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
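
To illustrate the point in the comment above about similarities that have parameters (e.g. BM25 with b and k), here is a minimal, self-contained sketch. It is not the patch and does not use Lucene's Similarity or SimilarityProvider API: the class name Bm25Sketch, the per-field map and the field names are hypothetical, and the formula is the commonly cited BM25 term weight with 1 added inside the idf logarithm. It also keeps all inner arithmetic in double, as the comment suggests.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-alone BM25 term scorer; NOT Lucene's Similarity API. */
final class Bm25Sketch {
    private final double k1; // term-frequency saturation (the "k" mentioned above)
    private final double b;  // strength of document-length normalization, in [0, 1]

    Bm25Sketch(double k1, double b) {
        this.k1 = k1;
        this.b = b;
    }

    /** Inverse document frequency; adding 1 inside the log keeps it non-negative. */
    static double idf(long docCount, long docFreq) {
        return Math.log(1.0 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    /** BM25 weight of one term in one document, computed entirely in double. */
    double score(double tf, double docLen, double avgDocLen, long docCount, long docFreq) {
        double norm = k1 * (1.0 - b + b * docLen / avgDocLen);
        return idf(docCount, docFreq) * tf * (k1 + 1.0) / (tf + norm);
    }

    public static void main(String[] args) {
        // Two setups of the same model, keyed by a hypothetical field name.
        Map<String, Bm25Sketch> perField = new LinkedHashMap<>();
        perField.put("title", new Bm25Sketch(1.2, 0.0));  // no length normalization
        perField.put("body", new Bm25Sketch(1.2, 0.75));  // usual length normalization

        // Same statistics, different parameters: the scores differ per setup.
        for (Map.Entry<String, Bm25Sketch> e : perField.entrySet()) {
            double s = e.getValue().score(3.0, 200.0, 100.0, 1000L, 10L);
            System.out.println(e.getKey() + " -> " + s);
        }
    }
}
```

Each setup is its own immutable object, so anything that hands out similarities has to know which parameterized instance belongs to which field. That is the difficulty described above for a provider that assumes a single configuration.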