From: "Grant Ingersoll (JIRA)"
To: java-dev@lucene.apache.org
Reply-To: java-dev@lucene.apache.org
Date: Mon, 26 Nov 2007 17:57:43 -0800 (PST)
Subject: [jira] Updated: (LUCENE-1058) New Analyzer for buffering tokens
Message-ID: <27415133.1196128663688.JavaMail.jira@brutus>
In-Reply-To: <363633.1195480003451.JavaMail.jira@brutus>

     [ https://issues.apache.org/jira/browse/LUCENE-1058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Ingersoll updated LUCENE-1058:
------------------------------------

    Attachment: LUCENE-1058.patch

A new version of this with the
following changes/additions:

DocumentsWriter no longer requires that a Field have a value (e.g. stringValue).

Added a new Field constructor that allows for the construction of a Field without a value. This would allow for Analyzer implementations that produce their own tokens (whatever that means).

Moved CollaboratingAnalyzer, et al. to the core under analysis.buffered, as I thought these items should be in core given the changes to Field and DocumentsWriter.

Note: I think this is a subtle but important change in DocumentsWriter/Field behavior.

> New Analyzer for buffering tokens
> ---------------------------------
>
>          Key: LUCENE-1058
>          URL: https://issues.apache.org/jira/browse/LUCENE-1058
>      Project: Lucene - Java
>   Issue Type: Improvement
>   Components: Analysis
>     Reporter: Grant Ingersoll
>     Assignee: Grant Ingersoll
>     Priority: Minor
>  Attachments: LUCENE-1058.patch, LUCENE-1058.patch, LUCENE-1058.patch
>
>
> In some cases, it would be handy to have Analyzer/Tokenizer/TokenFilters that could siphon off certain tokens and store them in a buffer to be used later in the processing pipeline.
> For example, if you want to have two fields, one lowercased and one not, but all the other analysis is the same, then you could save off the tokens to be output for a different field.
> Patch to follow, but I am still not sure about a couple of things, mostly how it plays with the new reuse API.
> See http://www.gossamer-threads.com/lists/lucene/java-dev/54397?search_string=BufferingAnalyzer;#54397
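
For readers who want to picture the "siphon off tokens into a buffer" idea from the issue description, here is a rough sketch in plain Java. It does not use the Lucene API and is not the code in the patch: BufferingStage, BufferingSketch, the whitespace split, and the field names body_lowercased and body_exact are all made up for illustration. The shared analysis runs once; the first field lowercases the tokens as it consumes them, and the second field has no text of its own and simply takes its tokens from the buffer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Locale;

// Hypothetical illustration only: these classes are not part of Lucene or the patch.

/** Passes tokens through unchanged while saving a copy of each one in a buffer. */
final class BufferingStage implements Iterator<String> {
    private final Iterator<String> input;
    private final List<String> buffer = new ArrayList<>();

    BufferingStage(Iterator<String> input) {
        this.input = input;
    }

    @Override
    public boolean hasNext() {
        return input.hasNext();
    }

    @Override
    public String next() {
        String token = input.next();
        buffer.add(token); // siphon off a copy for a later consumer
        return token;
    }

    /** Tokens seen so far, in order, exactly as they passed through. */
    List<String> buffered() {
        return buffer;
    }
}

final class BufferingSketch {
    /** Analyzes the text once and produces token lists for two fields. */
    static Map<String, List<String>> analyze(String text) {
        // Shared analysis, run once. A whitespace split stands in for the real chain.
        List<String> tokens = new ArrayList<>();
        String trimmed = text.trim();
        if (!trimmed.isEmpty()) {
            tokens.addAll(Arrays.asList(trimmed.split("\\s+")));
        }
        BufferingStage stage = new BufferingStage(tokens.iterator());

        // Field 1 consumes the stream and lowercases each token.
        List<String> lowercased = new ArrayList<>();
        while (stage.hasNext()) {
            lowercased.add(stage.next().toLowerCase(Locale.ROOT));
        }

        // Field 2 has no value of its own: its tokens come from the buffer,
        // with the original case preserved.
        List<String> exact = new ArrayList<>(stage.buffered());

        Map<String, List<String>> fields = new LinkedHashMap<>();
        fields.put("body_lowercased", lowercased);
        fields.put("body_exact", exact);
        return fields;
    }
}
```

Calling BufferingSketch.analyze("Lucene Buffers Tokens") yields [lucene, buffers, tokens] for body_lowercased and [Lucene, Buffers, Tokens] for body_exact. The second field only works if the first has already been consumed, which is the kind of ordering dependency that makes the interaction with the reuse API worth checking.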