From: "Uwe Schindler (JIRA)"
To: java-dev@lucene.apache.org
Date: Fri, 9 Apr 2010 11:39:50 +0000 (UTC)
Subject: [jira] Updated: (LUCENE-2302) Replacement for TermAttribute+Impl with extended capabilities (byte[] support, CharSequence, Appendable)
Message-ID: <1952531390.5261270813190743.JavaMail.jira@brutus.apache.org>
In-Reply-To: <338357515.127221267994427244.JavaMail.jira@brutus.apache.org>
Reply-To: java-dev@lucene.apache.org

     [ https://issues.apache.org/jira/browse/LUCENE-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated
LUCENE-2302:
----------------------------------

    Attachment: LUCENE-2302-toString.patch

Patch that fixes the toString() problems in Token, adds the missing CHANGES.txt entry, fixes the backwards tests, and updates the javadocs to document the "backwards" break. Deprecating Token should be done in another issue.

I will commit this soon, so that we can go forward with the TokenStream conversion!

> Replacement for TermAttribute+Impl with extended capabilities (byte[] support, CharSequence, Appendable)
> --------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-2302
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2302
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Analysis
>    Affects Versions: Flex Branch
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: 3.1
>
>         Attachments: LUCENE-2302-toString.patch, LUCENE-2302.patch, LUCENE-2302.patch, LUCENE-2302.patch, LUCENE-2302.patch, LUCENE-2302.patch
>
>
> For flexible indexing, terms can be simple byte[] arrays, while the current TermAttribute only supports char[]. This is fine for plain text, but e.g. NumericTokenStream should work directly on the byte[] array.
> TermAttribute also lacks some interfaces that would make it simpler for users to work with: Appendable and CharSequence.
> I propose to create a new interface "CharTermAttribute" with a clean new API that concentrates on CharSequence and Appendable.
> The implementation class will simply support the old and the new interface, both working on the same term buffer. DEFAULT_ATTRIBUTE_FACTORY will take care of this. So if somebody adds a TermAttribute, he will get an implementation class that can also be used as a CharTermAttribute. Because both attributes create the same impl instance, both calls to addAttribute are equivalent. So a TokenFilter that adds CharTermAttribute to the source will work with the same instance as the Tokenizer that requested the (deprecated) TermAttribute.
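The shared-instance idea in the paragraph above could look roughly like the sketch below. It is only an illustration under stated assumptions, not the patch itself: OldTermAttribute, NewCharTermAttribute, TermImpl and AttributeSourceSketch are hypothetical stand-ins written for this example, and none of them is Lucene's real API. The only point shown is that one implementation object backs both interfaces, so a write through either view is visible through the other.

```java
import java.util.HashMap;
import java.util.Map;

public class SharedTermDemo {

    /** Hypothetical stand-in for the old, deprecated term attribute. */
    public interface OldTermAttribute {
        String term();
        void setTermBuffer(String text);
    }

    /** Hypothetical stand-in for the proposed CharSequence + Appendable attribute. */
    public interface NewCharTermAttribute extends CharSequence, Appendable {
        NewCharTermAttribute setEmpty();
        // Covariant overrides without IOException, so callers need no try/catch.
        @Override NewCharTermAttribute append(CharSequence csq);
        @Override NewCharTermAttribute append(CharSequence csq, int start, int end);
        @Override NewCharTermAttribute append(char c);
    }

    /** One implementation class; both interfaces work on the same buffer. */
    public static final class TermImpl implements OldTermAttribute, NewCharTermAttribute {
        private final StringBuilder buffer = new StringBuilder();

        @Override public String term() { return buffer.toString(); }
        @Override public void setTermBuffer(String text) { buffer.setLength(0); buffer.append(text); }

        @Override public NewCharTermAttribute setEmpty() { buffer.setLength(0); return this; }
        @Override public NewCharTermAttribute append(CharSequence csq) { buffer.append(csq); return this; }
        @Override public NewCharTermAttribute append(CharSequence csq, int start, int end) {
            buffer.append(csq, start, end);
            return this;
        }
        @Override public NewCharTermAttribute append(char c) { buffer.append(c); return this; }

        @Override public int length() { return buffer.length(); }
        @Override public char charAt(int index) { return buffer.charAt(index); }
        @Override public CharSequence subSequence(int start, int end) { return buffer.subSequence(start, end); }
        @Override public String toString() { return buffer.toString(); }
    }

    /** Toy attribute source: the "factory" step registers one TermImpl under both interfaces. */
    public static final class AttributeSourceSketch {
        private final Map<Class<?>, Object> byInterface = new HashMap<>();

        public <T> T addAttribute(Class<T> iface) {
            if (iface != OldTermAttribute.class && iface != NewCharTermAttribute.class) {
                throw new IllegalArgumentException("this sketch only knows the two term interfaces");
            }
            if (!byInterface.containsKey(iface)) {
                TermImpl shared = new TermImpl();
                byInterface.put(OldTermAttribute.class, shared);
                byInterface.put(NewCharTermAttribute.class, shared);
            }
            return iface.cast(byInterface.get(iface));
        }
    }

    public static void main(String[] args) {
        AttributeSourceSketch source = new AttributeSourceSketch();
        OldTermAttribute oldView = source.addAttribute(OldTermAttribute.class);      // "Tokenizer" side
        NewCharTermAttribute newView = source.addAttribute(NewCharTermAttribute.class); // "TokenFilter" side

        oldView.setTermBuffer("lucene");
        newView.append("-flex");

        System.out.println(oldView.term());                    // prints "lucene-flex"
        System.out.println((Object) oldView == (Object) newView); // prints "true"
    }
}
```

Registering the single instance under both interface keys is what makes the order of the two addAttribute calls irrelevant in this sketch.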
> To also support byte[]-only terms, as Collation or NumericField need, a separate getter-only interface will be added that returns a reusable BytesRef, e.g. BytesRefGetterAttribute. The default implementation class will also support this interface. For backwards compatibility with old self-made TermAttribute implementations, the indexer will check with hasAttribute() whether the BytesRef getter interface is present. If it is not, the indexer will wrap the old-style TermAttribute in a deprecated wrapper class that will be provided, new BytesRefGetterAttributeWrapper(TermAttribute), and use that wrapper instead.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
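The backwards-compatibility fallback described in the last paragraph of the issue can be sketched as follows. This is a hedged illustration only: LegacyTermAttribute, TermBytesGetter, TermBytesWrapper and resolve() are hypothetical names made up for the example, a plain byte[] stands in for the reusable BytesRef, and a Map lookup plays the role of hasAttribute(). UTF-8 is an assumption of the sketch as well.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class BytesFallbackDemo {

    /** Hypothetical stand-in for an old, self-made char-based term attribute. */
    public interface LegacyTermAttribute {
        String term();
    }

    /** Hypothetical getter-only view of the term as bytes (byte[] instead of BytesRef). */
    public interface TermBytesGetter {
        byte[] termBytes();
    }

    /** Adapter in the spirit of the proposed wrapper: exposes a legacy attribute as bytes. */
    public static final class TermBytesWrapper implements TermBytesGetter {
        private final LegacyTermAttribute delegate;

        public TermBytesWrapper(LegacyTermAttribute delegate) {
            this.delegate = delegate;
        }

        @Override
        public byte[] termBytes() {
            // Assumption of this sketch: terms are encoded as UTF-8.
            return delegate.term().getBytes(StandardCharsets.UTF_8);
        }
    }

    /** Indexer-side lookup: prefer the byte getter, otherwise wrap the legacy attribute. */
    public static TermBytesGetter resolve(Map<Class<?>, Object> attributes) {
        if (attributes.containsKey(TermBytesGetter.class)) { // plays the role of hasAttribute()
            return (TermBytesGetter) attributes.get(TermBytesGetter.class);
        }
        LegacyTermAttribute legacy = (LegacyTermAttribute) attributes.get(LegacyTermAttribute.class);
        if (legacy == null) {
            throw new IllegalStateException("no term attribute present");
        }
        return new TermBytesWrapper(legacy);
    }

    public static void main(String[] args) {
        Map<Class<?>, Object> attributes = new HashMap<>();
        LegacyTermAttribute legacy = () -> "flex";
        attributes.put(LegacyTermAttribute.class, legacy);

        // Only the legacy attribute is registered, so resolve() falls back to the wrapper.
        TermBytesGetter getter = resolve(attributes);
        System.out.println(getter.termBytes().length); // prints "4"
    }
}
```

Keeping the check in one place means the rest of the indexer only ever sees the getter interface, whether the attribute was native or wrapped.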