Date: Mon, 2 Dec 2013 14:57:39 +0000 (UTC)
From: "Michael McCandless (JIRA)"
To: dev@lucene.apache.org
Subject: [jira] [Commented] (LUCENE-5354) Blended score in AnalyzingInfixSuggester

    [ https://issues.apache.org/jira/browse/LUCENE-5354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836569#comment-13836569 ]

Michael McCandless commented on LUCENE-5354:
--------------------------------------------

This sounds very useful!

I think a subclass could work well, if we open up the necessary methods (which Query to run, how to do the search / resort the results)?

We could make the index-time sorting optional as well? This way you'd build an "ordinary" index, run an "ordinary" query, so you have full flexibility (but at more search-time cost).

> Blended score in AnalyzingInfixSuggester
> ----------------------------------------
>
>                 Key: LUCENE-5354
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5354
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/spellchecker
>    Affects Versions: 4.4
>            Reporter: Remi Melisson
>            Priority: Minor
>              Labels: suggester
>
> I'm working on a custom suggester derived from AnalyzingInfixSuggester. I require what is called a "blended score" (//TODO ln.399 in AnalyzingInfixSuggester) to transform the suggestion weights depending on the position of the searched term(s) in the text.
> Right now, I'm using an easy solution:
> If I want 10 suggestions, I search the current ordered index for the first 100 results and transform the weight:
> bq. a) by using the term position in the text (found with TermVector and DocsAndPositionsEnum)
> or
> bq. b) by multiplying the weight by the score of a SpanQuery that I add when searching
> and return the 10 suggestions with the highest updated weights.
> Since we usually don't need to suggest so many things, the overhead of the bigger search plus rescoring is not very significant, but I agree that this is not the most elegant solution.
> We could include this factor (here, the position of the term) directly in the index.
> So, I can contribute to this if you think it's worth adding.
> Do you think I should tweak AnalyzingInfixSuggester, subclass it, or create a dedicated class?
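
For illustration, the over-fetch-and-rescore approach described in the quoted issue (search for more results than needed, transform each weight by term position, keep the best few) might look roughly like the sketch below. It is plain Java and does not use any Lucene API. The class BlendedRescoreSketch, the Candidate type, the overFetchFactor parameter and the 1/(1 + position) blending formula are all assumptions made up for this example; the issue does not say which formula is used, and in the real suggester the position would come from term vectors or a SpanQuery score as described above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch only: none of these names exist in Lucene.
final class BlendedRescoreSketch {

    /** One suggestion as returned by the weight-ordered index. */
    static final class Candidate {
        final String text;
        final long weight;   // weight stored in the index
        final int position;  // 0-based position of the first matched term (assumed known)

        Candidate(String text, long weight, int position) {
            this.text = text;
            this.weight = weight;
            this.position = position;
        }
    }

    /** Assumed blending formula: earlier matches keep more of their weight. */
    static double blend(long weight, int position) {
        return weight / (1.0 + position);
    }

    /**
     * Takes the first num * overFetchFactor candidates (already ordered by
     * index weight, highest first), re-sorts that window by blended score,
     * and returns the top num.
     */
    static List<Candidate> topBlended(List<Candidate> byWeight, int num, int overFetchFactor) {
        int limit = Math.min(byWeight.size(), num * overFetchFactor);
        List<Candidate> window = new ArrayList<>(byWeight.subList(0, limit));
        window.sort(Comparator
                .comparingDouble((Candidate c) -> blend(c.weight, c.position))
                .reversed());
        return new ArrayList<>(window.subList(0, Math.min(num, window.size())));
    }
}
```

With num = 10 and overFetchFactor = 10 this corresponds to the "search for 100, return 10" setup in the issue. A suggestion that falls outside the over-fetched window can never be promoted, however good its position is, which is the main limitation of rescoring at search time instead of including the position factor in the index.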