From: Erik Hatcher
To: java-dev@lucene.apache.org
Subject: Re: contrib: keywordTokenStream
Date: Tue, 3 May 2005 20:26:16 -0400
In-Reply-To: <189d07e140f5642580fcf3cea73a9523@lbl.gov>
References: <189d07e140f5642580fcf3cea73a9523@lbl.gov>

Wolfgang,

I've now added this. I'm not seeing how this could be generally useful.
I'm curious how you are using it and why it is better suited for what
you're doing than any other analyzer.

"keyword tokenizer" is a bit overloaded terminology-wise, though - look
in the contrib/analyzers/src/java area to see what I mean.

    Erik

On May 3, 2005, at 4:26 PM, Wolfgang Hoschek wrote:

> Here's a convenience add-on method to MemoryIndex. If it turns out
> that this could be of wider use, it could be moved into the core
> analysis package. For the moment the MemoryIndex might be a better
> home. Opinions, anyone?
>
> Wolfgang.
>
>     /**
>      * Convenience method; Creates and returns a token stream that generates a
>      * token for each keyword in the given collection, "as is", without any
>      * transforming text analysis. The resulting token stream can be fed into
>      * {@link #addField(String, TokenStream)}, perhaps wrapped into another
>      * {@link org.apache.lucene.analysis.TokenFilter}, as desired.
>      *
>      * @param keywords
>      *            the keywords to generate tokens for
>      * @return the corresponding token stream
>      */
>     public TokenStream keywordTokenStream(final Collection keywords) {
>         if (keywords == null)
>             throw new IllegalArgumentException("keywords must not be null");
>
>         return new TokenStream() {
>             Iterator iter = keywords.iterator();
>             int pos = 0;
>             int start = 0;
>             public Token next() {
>                 if (!iter.hasNext()) return null;
>
>                 Object obj = iter.next();
>                 if (obj == null)
>                     throw new IllegalArgumentException("keyword must not be null");
>
>                 String term = obj.toString();
>                 Token token = new Token(term, start, start + term.length());
>                 start += term.length() + 1; // separate words by 1 (blank) character
>                 pos++;
>                 return token;
>             }
>         };
>     }
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
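
For anyone following the thread without Lucene on the classpath, the offset bookkeeping in the quoted next() method can be sketched on its own. This is only an illustration under stated assumptions: the class KeywordOffsets and its offsets() helper are hypothetical names made up for this note and are not part of MemoryIndex or of Lucene. The sketch mirrors the start/end arithmetic of the quoted code and leaves out the pos counter, which the quoted snippet increments but never reads.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical stand-alone sketch; not part of Lucene or MemoryIndex.
final class KeywordOffsets {

    /**
     * Returns one {start, end} pair per keyword, computed with the same
     * arithmetic as the quoted next() method: each keyword is treated as
     * if it were followed by a single separator character.
     */
    static List<int[]> offsets(Collection<?> keywords) {
        if (keywords == null)
            throw new IllegalArgumentException("keywords must not be null");

        List<int[]> result = new ArrayList<int[]>();
        int start = 0;
        for (Object obj : keywords) {
            if (obj == null)
                throw new IllegalArgumentException("keyword must not be null");

            String term = obj.toString();
            result.add(new int[] { start, start + term.length() });
            start += term.length() + 1; // separate words by 1 (blank) character
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> keywords = Arrays.asList("lucene", "memory", "index");
        List<int[]> offs = offsets(keywords);
        for (int i = 0; i < offs.size(); i++) {
            System.out.println(keywords.get(i)
                    + " start=" + offs.get(i)[0]
                    + " end=" + offs.get(i)[1]);
        }
    }
}
```

With the three keywords in main, the pairs come out as (0, 6), (7, 13) and (14, 19), i.e. the offsets the keywords would have if they had been joined with single blanks into one string.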