lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Michael Busch <busch...@gmail.com>
Subject Re: Lucene 2.9.0-rc2 [PROBLEM] : TokenStream API (incrementToken / AttributeSource), cannot implement a LookaheadTokenFilter.
Date Tue, 01 Sep 2009 23:44:04 GMT
Daniel,

take a look at the captureState() and restoreState() APIs in 
AttributeSource and TokenStream. captureState() returns a State object 
containing all attributes with its' current values. restoreState(State) 
takes a given State and copies its values back into the TokenStream. You 
should be able to achieve the same thing by storing State objects in 
your List, instead of Token objects. peek() would change to return 
true/false instead of Token and the caller of peek consumes the values 
using the new attribute API. The change on your side should be pretty 
simple, let us know if you run into problems!

  Michael

On 9/1/09 3:12 PM, Daniel Shane wrote:
> After thinking about it, the only conclusion I got was instead of 
> saving the token, to save an iterator of Attributes and use that 
> instead. It may work.
>
> Daniel Shane
>
> Daniel Shane wrote:
>> Hi all!
>>
>> I'm trying to port my Lucene code to the new TokenStream API and I 
>> have a filter that I cannot seem to port using the current new API.
>>
>> The filter is called LookaheadTokenFilter. It behaves exactly like a 
>> normal token filter, except, you can call peek() and get information 
>> on the next token in the stream.
>>
>> Since Lucene does not support stream "rewinding", we did this by 
>> buffering tokens when peek() was called and giving those back when 
>> next() was called and when no more "peeked" tokens exist, we then 
>> call super.next();
>>
>> Now, I'm looking at this new API and really I'm stuck at how to port 
>> this using incrementToken...
>>
>> Am I missing something, is there an object I can get from the 
>> TokenStream that I can save and get all the attributes from?
>>
>> Here is the code I'm trying to port :
>>
>> public class LookaheadTokenFilter extends TokenFilter {
>>    /** List of tokens that were peeked but not returned with next. */
>>    LinkedList<Token> peekedTokens = new LinkedList<Token>();
>>
>>    /** The position of the next character that peek() will return in 
>> peekedTokens */
>>    int peekPosition = 0;
>>
>>    public LookaheadTokenFilter(TokenStream input) {
>>        super(input);
>>    }
>>      public Token peek() throws IOException {
>>        if (this.peekPosition >= this.peekedTokens.size()) {
>>            Token token = new Token();
>>            token = this.input.next(token);
>>            if (token != null) {
>>                this.peekedTokens.add(token);
>>                this.peekPosition = this.peekedTokens.size();
>>            }
>>            return token;
>>        }
>>
>>        return this.peekedTokens.get(this.peekPosition++);
>>    }
>>
>>    public void reset() { this.peekPosition = 0; }
>>
>>    public Token next(Token token) throws IOException {
>>        reset();
>>
>>        if (this.peekedTokens.size() > 0) {
>>            return this.peekedTokens.removeFirst();
>>        }
>>                  return this.input.next(token);          }
>> }
>>
>> Let me know if anyone has an idea,
>> Daniel Shane
>>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message