Subject: Re: analysis filter wrapper
From: Joel Halbert
To: java-user@lucene.apache.org
Organization: SU3 Analytics
Date: Thu, 14 May 2009 12:22:53 +0100

You can use your Analyzer to get a token stream from any text you give it, just like Lucene does.
Something like:

    import java.io.StringReader;
    import org.apache.lucene.analysis.Token;
    import org.apache.lucene.analysis.TokenStream;

    String text = "your list of words to analyze and tokenize";
    TokenStream ts = YOUR_ANALYZER.tokenStream(null, new StringReader(text));
    Token reusable = new Token();
    // next(Token) may return a different instance than the one passed in,
    // so read the term from the returned token, not from 'reusable'.
    for (Token token = ts.next(reusable); token != null; token = ts.next(reusable)) {
        // termBuffer() can be longer than the term; only the first termLength() chars are valid.
        String t = new String(token.termBuffer(), 0, token.termLength());
        System.out.println("Got token " + t);
    }

-----Original Message-----
From: Marek Rei
Reply-To: java-user@lucene.apache.org
To: java-user@lucene.apache.org
Subject: analysis filter wrapper
Date: Thu, 14 May 2009 11:57:57 +0100

Hi,

I'm rather new to Lucene and could use some help.

My Analyzer uses a set of filters (PorterStemFilter, LowerCaseFilter, WhitespaceTokenizer). I need to replicate the effect of these filters outside of the normal Lucene pipeline. Basically, I would like to input a String at one end and get a processed String or String[] at the other end.

Is there a good way to do this? I'm trying to figure it out myself, but in case I fail, maybe someone here could give some advice?

Thank you!
Marek

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
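
The question above asks for a wrapper that takes a String at one end and returns a String[] at the other, after a tokenizer and a chain of filters have run. As a rough illustration of that shape only, here is a plain-Java sketch that does not use Lucene at all. Every name in it (FilterChainSketch, whitespaceTokenize, LOWER_CASE, STRIP_PLURAL_S, analyze) is hypothetical, and the "stemmer" is a crude placeholder that only strips a trailing "s"; it is not the Porter algorithm and will not match PorterStemFilter output.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.UnaryOperator;

public class FilterChainSketch {

    // Stand-in for a whitespace tokenizer: split on runs of whitespace.
    static List<String> whitespaceTokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.trim().split("\\s+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    // Stand-in for a lower-casing filter.
    static final UnaryOperator<String> LOWER_CASE = t -> t.toLowerCase(Locale.ROOT);

    // Crude placeholder for a stemmer (assumption: NOT Porter stemming).
    // Strips a single trailing "s" from tokens longer than three characters.
    static final UnaryOperator<String> STRIP_PLURAL_S = t ->
            (t.length() > 3 && t.endsWith("s") && !t.endsWith("ss"))
                    ? t.substring(0, t.length() - 1)
                    : t;

    // String in, String[] out: tokenize, then apply each filter in order.
    @SafeVarargs
    static String[] analyze(String text, UnaryOperator<String>... filters) {
        List<String> out = new ArrayList<>();
        for (String token : whitespaceTokenize(text)) {
            for (UnaryOperator<String> filter : filters) {
                token = filter.apply(token);
            }
            out.add(token);
        }
        return out.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] terms = analyze("Lazy Dogs chase  Cats", LOWER_CASE, STRIP_PLURAL_S);
        System.out.println(String.join(" ", terms));
    }
}
```

With a real Analyzer, the body of analyze would instead loop over the TokenStream as in the reply above and collect each term into the list, so the analysis matches what Lucene does at index time.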