Date: Thu, 2 Feb 2012 20:30:55 +0000 (UTC)
From: "Robert Muir (Commented) (JIRA)"
To: dev@lucene.apache.org
Message-ID: <1159571195.4563.1328214655048.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <2078794453.4517.1328214053617.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (LUCENE-3748) EnglishPossessiveFilter should work with Unicode right single quotation mark

[
https://issues.apache.org/jira/browse/LUCENE-3748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13199189#comment-13199189 ]

Robert Muir commented on LUCENE-3748:
-------------------------------------

I agree with the patch. We can easily add backwards compat here, no problem.

As for any potential others, the only possibility from my perspective is U+FF07 FULLWIDTH APOSTROPHE, though I could go either way on that (since it's a compatibility character).

Any other opinions?

> EnglishPossessiveFilter should work with Unicode right single quotation mark
> ----------------------------------------------------------------------------
>
>                 Key: LUCENE-3748
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3748
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/analysis
>    Affects Versions: 3.1, 3.2, 3.4, 3.5
>            Reporter: David Croley
>            Priority: Minor
>         Attachments: LucenePatch
>
>
> The current EnglishPossessiveFilter (used in EnglishAnalyzer) removes possessives using only the '\'' character (plus 's' or 'S'), but some common systems (German?) insert the Unicode "\u2019" (RIGHT SINGLE QUOTATION MARK) instead, and this is not removed when processing UTF-8 text. I propose to change EnglishPossessiveFilter to support '\u2019' as an alternative to '\''.
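
To make the behavior under discussion concrete, here is a minimal standalone sketch of possessive stripping that accepts the ASCII apostrophe, U+2019, and (optionally, per the comment above) U+FF07. It is not the Lucene EnglishPossessiveFilter source or the attached patch, and it leaves out the backwards-compat handling mentioned above. The class name PossessiveStripper and its methods are hypothetical and exist only for illustration.

```java
/** Hypothetical illustration only; not part of Lucene. */
final class PossessiveStripper {

    private PossessiveStripper() {}

    // Characters treated as a possessive apostrophe in this sketch:
    // the ASCII apostrophe (what the issue says is handled today),
    // U+2019 RIGHT SINGLE QUOTATION MARK (the proposed addition), and
    // U+FF07 FULLWIDTH APOSTROPHE (the optional candidate from the comment).
    static boolean isApostrophe(char c) {
        return c == '\'' || c == '\u2019' || c == '\uFF07';
    }

    /**
     * Returns the term without a trailing apostrophe + 's' or 'S'.
     * Terms that do not end in a possessive are returned unchanged.
     */
    static String stripPossessive(String term) {
        int len = term.length();
        if (len >= 2
                && isApostrophe(term.charAt(len - 2))
                && (term.charAt(len - 1) == 's' || term.charAt(len - 1) == 'S')) {
            return term.substring(0, len - 2);
        }
        return term;
    }
}
```

With this sketch, PossessiveStripper.stripPossessive("dog\u2019s") and PossessiveStripper.stripPossessive("dog's") both return "dog", while "dogs" is returned unchanged. Dropping U+FF07 from isApostrophe would give the narrower behavior the issue originally asks for.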