Mailing-List: contact dev-help@lucene.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@lucene.apache.org
Message-ID: <10671356.84981289254945349.JavaMail.jira@thor>
Date: Mon, 8 Nov 2010 17:22:25 -0500 (EST)
From: "Tom Burton-West (JIRA)"
To: dev@lucene.apache.org
Subject: [jira] Updated: (SOLR-2211) Create Solr FilterFactory for Lucene StandardTokenizer with UAX#29 support
In-Reply-To: <6255994.176671288632744834.JavaMail.jira@thor>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

     [ https://issues.apache.org/jira/browse/SOLR-2211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom Burton-West updated SOLR-2211:
----------------------------------

    Attachment: SOLR-2211.patch

Patch implements Solr UAX29TokenizerFactory and TestUAX29TokenizerFactory.

Tom

> Create Solr FilterFactory for Lucene StandardTokenizer with UAX#29 support
> ---------------------------------------------------------------------------
>
>                 Key: SOLR-2211
>                 URL: https://issues.apache.org/jira/browse/SOLR-2211
>             Project: Solr
>          Issue Type: New Feature
>    Affects Versions: 3.1
>            Reporter: Tom Burton-West
>            Priority: Minor
>         Attachments: SOLR-2211.patch
>
>
> The Lucene 3.x StandardTokenizer with UAX#29 support provides benefits for non-English tokenizing. Presently it can be invoked by using the StandardTokenizerFactory and setting the Version to 3.1. However, it would be useful to be able to use the improved Unicode processing without necessarily including the IP address and email address processing of StandardAnalyzer. A FilterFactory that allowed the use of the StandardTokenizer with UAX#29 support on its own would be useful.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org