From: "James Dyer (Updated) (JIRA)"
To: dev@lucene.apache.org
Date: Thu, 29 Dec 2011 17:59:30 +0000 (UTC)
Message-ID: <1853863039.51998.1325181570750.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <181280048.51968.1325180257928.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Updated] (SOLR-2993) Integrate WordBreakSpellChecker with Solr

[
https://issues.apache.org/jira/browse/SOLR-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Dyer updated SOLR-2993:
-----------------------------
    Attachment: SOLR-2993.patch

Patch adds features described in this issue.

Users can create a dictionary configuration in solrconfig.xml like this:

{code:xml}
wordbreak solr.WordBreakSolrSpellChecker lowerfilt true true 10
{code}

Users can also specify multiple "spellcheck.dictionary" parameters. All specified dictionaries are consulted and their results are interleaved (this is handled by the new ConjunctionSolrSpellChecker). Collations are created from combinations of corrections from the different spellcheckers, with care taken that multiple overlapping corrections do not occur in the same collation.

{code:xml}
default wordbreak 20 spellcheck
{code}

A future enhancement (outside the scope of this issue) would be to extend ConjunctionSolrSpellChecker to allow arbitrary dictionary combinations, for instance if a user wanted to query two fields and have a separate dictionary consulted for each field. With this patch, however, ConjunctionSolrSpellChecker is intended only for adding word-break suggestions to single-word suggestions.

> Integrate WordBreakSpellChecker with Solr
> -----------------------------------------
>
> Key: SOLR-2993
> URL: https://issues.apache.org/jira/browse/SOLR-2993
> Project: Solr
> Issue Type: Improvement
> Components: SolrCloud, spellchecker
> Affects Versions: 4.0
> Reporter: James Dyer
> Priority: Minor
> Fix For: 4.0
>
> Attachments: SOLR-2993.patch
>
>
> A SpellCheckComponent enhancement, leveraging the WordBreakSpellChecker from LUCENE-3523:
> - Detect spelling errors resulting from misplaced whitespace without the use of shingle-based dictionaries.
> - Seamlessly integrate word-break suggestions with single-word spelling corrections from the existing FileBased-, IndexBased- or Direct- spell checkers.
> - Provide collation support for word-break errors including cases where the user has a mix of single-word spelling errors and word-break errors in the same query.
> - Provide shard support.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org
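
The two {code:xml} snippets in the comment above appear in this archive only as bare values. As a rough sketch of how those values might map onto solrconfig.xml, the fragment below assumes the element and parameter names documented for later Solr releases (classname, field, combineWords, breakWords, maxChanges, spellcheck.count, last-components). The patch attached to this issue may have used different names. The handler name /spell, the enclosing searchComponent, and the reading of "20" as spellcheck.count are illustrative assumptions, not taken from the message.

```xml
<!-- Assumed layout: parameter names follow later Solr documentation, not this patch. -->
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <!-- A single-word spellchecker named "default" is assumed to be defined here too. -->
  <lst name="spellchecker">
    <str name="name">wordbreak</str>
    <str name="classname">solr.WordBreakSolrSpellChecker</str>
    <str name="field">lowerfilt</str>
    <!-- Assumed meaning: suggest joining adjacent terms / splitting a single term. -->
    <str name="combineWords">true</str>
    <str name="breakWords">true</str>
    <int name="maxChanges">10</int>
  </lst>
</searchComponent>

<!-- Hypothetical handler that consults both dictionaries on every request. -->
<requestHandler name="/spell" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- Repeating spellcheck.dictionary selects more than one spellchecker. -->
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck.dictionary">wordbreak</str>
    <str name="spellcheck.count">20</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
```

With a layout like this, listing both "default" and "wordbreak" as spellcheck.dictionary values is what would route the request through ConjunctionSolrSpellChecker, as the comment describes.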