From: "David Smiley (JIRA)"
To: dev@lucene.apache.org
Date: Mon, 8 Aug 2011 22:18:27 +0000 (UTC)
Subject: [jira] [Created] (LUCENE-3366) StandardFilter only works with ClassicTokenizer and only when version < 3.1

StandardFilter only works with ClassicTokenizer and only when version < 3.1
---------------------------------------------------------------------------

                 Key: LUCENE-3366
                 URL: https://issues.apache.org/jira/browse/LUCENE-3366
             Project: Lucene - Java
          Issue Type: Improvement
          Components: modules/analysis
    Affects Versions: 3.3
            Reporter: David Smiley

The StandardFilter used to remove the periods from acronyms and strip a trailing apostrophe-S ('s) from tokens, and it did so in conjunction with the StandardTokenizer. Presently, it only does this with ClassicTokenizer, and only when the Lucene match version is before 3.1. Here is an excerpt from the code:

{code:lang=java}
public final boolean incrementToken() throws IOException {
  if (matchVersion.onOrAfter(Version.LUCENE_31))
    return input.incrementToken(); // TODO: add some niceties for the new grammar
  else
    return incrementTokenClassic();
}
{code}

It seems to me that something was forgotten here in the great refactor of the standard tokenizer, LUCENE-2167. I think that if someone uses the ClassicTokenizer, then this filter should do what it used to do, no matter what the version is. The TODO also suggests that someone forgot to make this filter do something useful for the StandardTokenizer. Or perhaps that idea should be discarded and this class should be renamed ClassicTokenFilter.

In any event, the javadocs for this class appear to be out of date, as they make no mention of ClassicTokenizer, and the wiki is out of date too.
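For anyone unfamiliar with the old behavior, here is a minimal, Lucene-free sketch of the two normalizations described above: stripping a trailing 's from possessive tokens and removing the periods from acronyms. It is only an illustration, not the actual StandardFilter code. The class name, the `TokenType` enum, and the `normalize` method are hypothetical, and the sketch assumes the tokenizer has already labeled each token with its type.

```java
// Illustrative sketch only; every name in this class is hypothetical.
final class ClassicNormalizationSketch {

    /** Token categories assumed to have been assigned by the tokenizer. */
    enum TokenType { ALPHANUM, APOSTROPHE, ACRONYM }

    /** Returns the term with the classic-style normalization for its type applied. */
    static String normalize(String term, TokenType type) {
        int len = term.length();
        switch (type) {
            case APOSTROPHE:
                // Drop a trailing 's or 'S, e.g. "Lucene's" becomes "Lucene".
                if (len >= 2
                        && term.charAt(len - 2) == '\''
                        && (term.charAt(len - 1) == 's' || term.charAt(len - 1) == 'S')) {
                    return term.substring(0, len - 2);
                }
                return term;
            case ACRONYM:
                // Drop every period, e.g. "I.B.M." becomes "IBM".
                return term.replace(".", "");
            default:
                // All other token types pass through unchanged.
                return term;
        }
    }
}
```

Under these assumptions, `normalize("I.B.M.", TokenType.ACRONYM)` yields `IBM`, while a token of any other type is returned untouched. That pass-through case corresponds to what the filter currently does for every token once the match version is 3.1 or later.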