From: Michael Busch
Date: Tue, 01 Sep 2009 16:44:04 -0700
To: java-user@lucene.apache.org
Subject: Re: Lucene 2.9.0-rc2 [PROBLEM] : TokenStream API (incrementToken / AttributeSource), cannot implement a LookaheadTokenFilter.
Message-ID: <4A9DB1C4.3000001@gmail.com>
In-Reply-To: <4A9D9C48.7040703@lexum.umontreal.ca>
References: <5B20DEF02611534DB08854076CE825D8032DB19B@sc1exc2.corp.emainc.com> <4A9D99C7.3080400@lexum.umontreal.ca> <4A9D9C48.7040703@lexum.umontreal.ca>

Daniel,

take a look at the captureState() and restoreState() APIs in
AttributeSource and TokenStream. captureState() returns a State object
containing all attributes with their current values.
restoreState(State) takes a given State and copies its values back into
the TokenStream. You should be able to achieve the same thing by storing
State objects in your List instead of Token objects. peek() would change
to return true/false instead of a Token, and the caller of peek() would
consume the values using the new attribute API.
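
To make the buffering idea concrete, here is a minimal, dependency-free
sketch of the pattern. It is a model under stated assumptions, not Lucene
code: MiniStream is a hypothetical stand-in for a TokenStream with a
single term attribute, and its captureState()/restoreState() only imitate
the calls named above. Against real Lucene the list would hold
AttributeSource.State objects rather than strings.

```java
import java.util.LinkedList;

/** Hypothetical stand-in for a TokenStream: one "term" attribute, overwritten per token. */
class MiniStream {
    private final String[] terms;
    private int next = 0;
    String term; // the single attribute of this toy stream

    MiniStream(String... terms) { this.terms = terms; }

    /** Advances to the next token; returns false once the input is exhausted. */
    boolean incrementToken() {
        if (next >= terms.length) return false;
        term = terms[next++];
        return true;
    }

    /** Models captureState(): a snapshot of the current attribute values. */
    String captureState() { return term; }

    /** Models restoreState(State): copies a snapshot back into the attributes. */
    void restoreState(String state) { term = state; }
}

/** Lookahead filter that buffers captured states instead of Token objects. */
class LookaheadFilter {
    private final MiniStream input;
    private final LinkedList<String> peekedStates = new LinkedList<>();
    private int peekPosition = 0;

    LookaheadFilter(MiniStream input) { this.input = input; }

    /** Loads the next upcoming token into the attributes without consuming it. */
    boolean peek() {
        if (peekPosition >= peekedStates.size()) {
            if (!input.incrementToken()) return false; // nothing left to look at
            peekedStates.add(input.captureState());
            peekPosition = peekedStates.size();
            return true;
        }
        input.restoreState(peekedStates.get(peekPosition++));
        return true;
    }

    /** Replays buffered tokens first, then falls through to the wrapped stream. */
    boolean incrementToken() {
        peekPosition = 0; // consuming a token restarts the lookahead
        if (!peekedStates.isEmpty()) {
            input.restoreState(peekedStates.removeFirst());
            return true;
        }
        return input.incrementToken();
    }

    String term() { return input.term; }
}

class LookaheadSketch {
    public static void main(String[] args) {
        LookaheadFilter f = new LookaheadFilter(new MiniStream("a", "b", "c"));
        f.incrementToken();              // current token: a
        f.peek();                        // attributes now show b
        f.peek();                        // attributes now show c
        System.out.println(f.term());
        while (f.incrementToken()) {     // b and c are replayed from the buffer
            System.out.println(f.term());
        }
    }
}
```

One consequence of this design: peek() overwrites the attribute values of
the current token, so a caller that still needs them should read them (or
capture its own state) before peeking.
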
The change on your side should be pretty simple, let us know if you run
into problems!

  Michael

On 9/1/09 3:12 PM, Daniel Shane wrote:
> After thinking about it, the only conclusion I got was instead of
> saving the token, to save an iterator of Attributes and use that
> instead. It may work.
>
> Daniel Shane
>
> Daniel Shane wrote:
>> Hi all!
>>
>> I'm trying to port my Lucene code to the new TokenStream API and I
>> have a filter that I cannot seem to port using the current new API.
>>
>> The filter is called LookaheadTokenFilter. It behaves exactly like a
>> normal token filter, except, you can call peek() and get information
>> on the next token in the stream.
>>
>> Since Lucene does not support stream "rewinding", we did this by
>> buffering tokens when peek() was called and giving those back when
>> next() was called and when no more "peeked" tokens exist, we then
>> call super.next();
>>
>> Now, I'm looking at this new API and really I'm stuck at how to port
>> this using incrementToken...
>>
>> Am I missing something, is there an object I can get from the
>> TokenStream that I can save and get all the attributes from?
>>
>> Here is the code I'm trying to port :
>>
>> public class LookaheadTokenFilter extends TokenFilter {
>>     /** List of tokens that were peeked but not returned with next. */
>>     LinkedList<Token> peekedTokens = new LinkedList<Token>();
>>
>>     /** The position of the next character that peek() will return in
>>     peekedTokens */
>>     int peekPosition = 0;
>>
>>     public LookaheadTokenFilter(TokenStream input) {
>>         super(input);
>>     }
>>     public Token peek() throws IOException {
>>         if (this.peekPosition >= this.peekedTokens.size()) {
>>             Token token = new Token();
>>             token = this.input.next(token);
>>             if (token != null) {
>>                 this.peekedTokens.add(token);
>>                 this.peekPosition = this.peekedTokens.size();
>>             }
>>             return token;
>>         }
>>
>>         return this.peekedTokens.get(this.peekPosition++);
>>     }
>>
>>     public void reset() { this.peekPosition = 0; }
>>
>>     public Token next(Token token) throws IOException {
>>         reset();
>>
>>         if (this.peekedTokens.size() > 0) {
>>             return this.peekedTokens.removeFirst();
>>         }
>>         return this.input.next(token);     }
>> }
>>
>> Let me know if anyone has an idea,
>> Daniel Shane
>>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org