Message-ID: <2139586.15101272900657468.JavaMail.jira@thor>
Date: Mon, 3 May 2010 11:30:57 -0400 (EDT)
From: "Steven Rowe (JIRA)"
To: dev@lucene.apache.org
Reply-To: dev@lucene.apache.org
Subject: [jira] Commented: (LUCENE-2400) ShingleFilter: don't output all-filler shingles/unigrams; also, convert from TermAttribute to CharTermAttribute

    [ https://issues.apache.org/jira/browse/LUCENE-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12863378#action_12863378 ]

Steven Rowe commented on LUCENE-2400:
-------------------------------------

Thanks Uwe!

> ShingleFilter: don't output all-filler shingles/unigrams; also, convert from TermAttribute to CharTermAttribute
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-2400
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2400
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: contrib/analyzers
>    Affects Versions: 3.0.1
>            Reporter: Steven Rowe
>            Assignee: Uwe Schindler
>            Priority: Minor
>         Attachments: LUCENE-2400.patch, LUCENE-2400.patch, LUCENE-2400.patch, LUCENE-2400.patch
>
>
> When the input token stream to ShingleFilter has position increments greater than one, filler tokens are inserted for each position for which there is no token in the input token stream. As a result, unigrams (if configured) and shingles can be filler-only. Filler-only output tokens make no sense - these should be removed.
> Also, because TermAttribute has been deprecated in favor of CharTermAttribute, the patch will also convert TermAttribute usages to CharTermAttribute in ShingleFilter.
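
As a rough illustration of the behavior described in the issue, the sketch below models a token stream as (term, position increment) pairs, inserts a filler for every skipped position, and then drops any shingle made up only of fillers. It is a simplified standalone model and not the ShingleFilter code from the patch: the class name ShingleSketch, the "_" filler string, and both method names are assumptions made for illustration, and no Lucene API is used.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical standalone model; not Lucene's ShingleFilter.
final class ShingleSketch {
    // Assumed filler string, used here only for illustration.
    static final String FILLER = "_";

    // Expand (term, positionIncrement) pairs into one slot per position,
    // inserting FILLER for every position that has no input token.
    static List<String> expand(String[] terms, int[] posIncs) {
        List<String> slots = new ArrayList<>();
        for (int i = 0; i < terms.length; i++) {
            for (int gap = 1; gap < posIncs[i]; gap++) {
                slots.add(FILLER);
            }
            slots.add(terms[i]);
        }
        return slots;
    }

    // Build shingles of exactly `size` slots, skipping any window that
    // contains nothing but fillers (with size 1 this drops filler unigrams).
    static List<String> shingles(List<String> slots, int size) {
        List<String> out = new ArrayList<>();
        for (int start = 0; start + size <= slots.size(); start++) {
            List<String> window = slots.subList(start, start + size);
            boolean allFiller = true;
            for (String s : window) {
                if (!s.equals(FILLER)) {
                    allFiller = false;
                }
            }
            if (!allFiller) {
                out.add(String.join(" ", window));
            }
        }
        return out;
    }
}
```

With terms "please", "divide", "sentence" and position increments 1, 3, 1, the expanded slots are please, _, _, divide, sentence. The two-slot window "_ _" is the all-filler shingle the issue wants removed, while mixed windows such as "please _" are kept in this model.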