lucene-java-user mailing list archives

From Benson Margulies <ben...@basistech.com>
Subject Re: reset versus setReader on TokenStream
Date Wed, 29 Aug 2012 19:45:40 GMT
On Wed, Aug 29, 2012 at 3:37 PM, Robert Muir <rcmuir@gmail.com> wrote:

> OK, let's help improve it: I think these have likely always been confusing.
>
> Before, they were both named reset: reset() and reset(Reader), even though
> they are unrelated. I thought the rename would help with this :)
>
> Does the TokenStream workflow here help?
>
> http://lucene.apache.org/core/4_0_0-BETA/core/org/apache/lucene/analysis/TokenStream.html
> Basically, reset() is a mandatory thing the consumer must call. It just
> means 'reset any mutable state so you can be reused for processing
> again'.
>

I really did read this. setReader I get; I don't understand what reset
accomplishes. What does it mean to reuse a TokenStream without calling
setReader to supply a new input? If it means reusing the old input, who does
the rewinding?

> This applies to any TokenStream: Tokenizers, TokenFilters, or
> even some direct descendant you make that parses byte arrays, or
> whatever.
>
> This means that if you are keeping some state across tokens (like
> StopFilter's #skippedTokens), reset() is where you would set that = 0
> again.
>
> setReader(Reader) is only on Tokenizer; it means replace the Reader
> with a different one to be processed.
> The fact that CharTokenizer is doing 'reset()-like-stuff' in here is
> bogus IMO, but I don't think it will cause any bugs. Don't emulate it
> :)
>
> On Wed, Aug 29, 2012 at 3:29 PM, Benson Margulies <benson@basistech.com>
> wrote:
> > I've read the javadoc through a few times, but I confess that I'm still
> > feeling dense.
> >
> > Are all tokenizers responsible for implementing some way of retaining the
> > contents of their reader, so that a call to reset without a call to
> > setReader rewinds? I note that CharTokenizer doesn't implement #reset,
> > which leads me to suspect that I'm not responsible for the rewind
> > behavior.
>
>
>
> --
> lucidworks.com
>

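To make the distinction in the quoted reply concrete, here is a minimal sketch in plain Java. It deliberately does not use Lucene: SketchStream, SketchWhitespaceTokenizer, SketchStopFilter, and ResetVersusSetReaderDemo are hypothetical stand-ins invented for illustration, and next() returning a String is an assumption made to keep the example short. What it tries to show is the division of labor described above: setReader supplies a new input, while reset only clears state kept across tokens (here, a skipped-token counter). Nothing in the sketch rewinds the old input.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a token stream; NOT Lucene's TokenStream.
abstract class SketchStream {
    /** Returns the next token, or null once the input is exhausted. */
    abstract String next() throws IOException;

    /** Clears per-pass mutable state. It never touches or rewinds the input. */
    void reset() throws IOException {
    }
}

// Hypothetical tokenizer: the only class in the chain that owns a Reader.
class SketchWhitespaceTokenizer extends SketchStream {
    private Reader input;

    /** Replaces the input with a different one to be processed. */
    void setReader(Reader input) {
        this.input = input;
    }

    @Override
    String next() throws IOException {
        StringBuilder token = new StringBuilder();
        int c;
        while ((c = input.read()) != -1) {
            if (Character.isWhitespace(c)) {
                if (token.length() > 0) {
                    break; // whitespace after some characters ends the token
                }
            } else {
                token.append((char) c);
            }
        }
        return token.length() > 0 ? token.toString() : null;
    }
}

// Hypothetical filter that keeps state across tokens, so it needs reset().
class SketchStopFilter extends SketchStream {
    private final SketchStream in;
    private final Set<String> stopWords;
    private int skipped; // state that must not leak into the next pass

    SketchStopFilter(SketchStream in, Set<String> stopWords) {
        this.in = in;
        this.stopWords = stopWords;
    }

    @Override
    String next() throws IOException {
        String token;
        while ((token = in.next()) != null) {
            if (stopWords.contains(token)) {
                skipped++;
            } else {
                return token;
            }
        }
        return null;
    }

    @Override
    void reset() throws IOException {
        in.reset();  // pass the call down the chain
        skipped = 0; // "here is where you would set that = 0 again"
    }

    int skipped() {
        return skipped;
    }
}

class ResetVersusSetReaderDemo {
    /** Consumer side: supply a new input, clear state, then pull tokens. */
    static List<String> consume(SketchWhitespaceTokenizer tokenizer,
                                SketchStream chain,
                                String text) throws IOException {
        tokenizer.setReader(new StringReader(text)); // new input
        chain.reset();                               // clear leftover state
        List<String> tokens = new ArrayList<>();
        for (String t = chain.next(); t != null; t = chain.next()) {
            tokens.add(t);
        }
        return tokens;
    }

    public static void main(String[] args) throws IOException {
        SketchWhitespaceTokenizer tokenizer = new SketchWhitespaceTokenizer();
        SketchStopFilter filter =
            new SketchStopFilter(tokenizer, new HashSet<>(Arrays.asList("the")));

        // The same chain is reused for two different inputs.
        System.out.println(consume(tokenizer, filter, "the quick the fox")
            + " skipped=" + filter.skipped());
        System.out.println(consume(tokenizer, filter, "the dog")
            + " skipped=" + filter.skipped());
    }
}
```

In this sketch the tokenizer does not override reset at all, because it keeps no state beyond the Reader it was handed. If the filter's reset did not zero the counter, the second pass would report skipped tokens left over from the first input.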