lucene-dev mailing list archives

From "Doron Cohen" <cdor...@gmail.com>
Subject Re: SinkTokenizer: next(Token) vs. next()
Date Fri, 28 Dec 2007 13:20:55 GMT
Hi Grant,

"Safer" was not the best wording, sorry for that. I meant performance-wise;
there's no correctness issue.

The "contract" of the two next methods, as I understand it, is that
a TokenStream must implement at least one of them. I see no harm in
implementing both (but doing so is likely to just duplicate TokenStream's code).

SinkTokenizer actually implements next(Token) with no reuse logic,
so it really should implement just next(). Then, if any consumer
of SinkTokenizer calls next(Token), the default implementation of this
method in TokenStream would call SinkTokenizer's next().
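
A rough sketch of the delegation I have in mind, using simplified stand-in
classes rather than the real Lucene code. SimpleToken, SimpleTokenStream,
ReuseStyleSink, PlainStyleSink and the scratchTokens counter are all
hypothetical names, and the two default methods only approximate how
TokenStream's defaults call each other:

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for Token; not the real Lucene class.
class SimpleToken {
    final String text;
    SimpleToken(String text) { this.text = text; }
}

// Hypothetical stand-in for TokenStream: each default delegates to the other,
// so a subclass must override at least one of them (otherwise they recurse forever).
abstract class SimpleTokenStream {
    int scratchTokens = 0; // counts tokens allocated only to satisfy next(Token)

    SimpleToken next() {
        scratchTokens++;
        return next(new SimpleToken(null)); // scratch token handed to the reuse method
    }

    SimpleToken next(SimpleToken reusable) {
        return next(); // no reuse: simply fall back to next()
    }
}

// Overrides next(Token) but ignores its argument (the situation discussed above).
class ReuseStyleSink extends SimpleTokenStream {
    private final Iterator<SimpleToken> it;
    ReuseStyleSink(List<SimpleToken> cached) { it = cached.iterator(); }

    @Override
    SimpleToken next(SimpleToken ignored) {
        return it.hasNext() ? it.next() : null;
    }
}

// Overrides only next(); callers of next(Token) reach it through the default.
class PlainStyleSink extends SimpleTokenStream {
    private final Iterator<SimpleToken> it;
    PlainStyleSink(List<SimpleToken> cached) { it = cached.iterator(); }

    @Override
    SimpleToken next() {
        return it.hasNext() ? it.next() : null;
    }
}

class SinkDemo {
    public static void main(String[] args) {
        List<SimpleToken> cached = Arrays.asList(
                new SimpleToken("a"), new SimpleToken("b"), new SimpleToken("c"));

        // Drain each sink through the method it does NOT override.
        ReuseStyleSink reuse = new ReuseStyleSink(cached);
        while (reuse.next() != null) { }

        PlainStyleSink plain = new PlainStyleSink(cached);
        SimpleToken scratch = new SimpleToken(null);
        while (plain.next(scratch) != null) { }

        System.out.println("ReuseStyleSink scratch tokens: " + reuse.scratchTokens);
        System.out.println("PlainStyleSink scratch tokens: " + plain.scratchTokens);
    }
}
```

In this sketch, draining ReuseStyleSink through next() allocates one scratch
token per call (including the final call that returns null), while draining
PlainStyleSink through next(Token) allocates none. That is the performance
difference I was referring to; both variants return the same tokens.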

Do you agree with this?

Cheers,
Doron

On Dec 27, 2007 4:20 PM, Grant Ingersoll <gsingers@apache.org> wrote:

>
> On Dec 26, 2007, at 6:20 PM, Doron Cohen wrote:
>
> > Working on Lucene-1101 I checked if SinkTokenizer.next(Token) should
> > also
> > call Token.clear(). (It shouldn't, because it ignores the input
> > token.)
> >
> > However I think that calls to next() would end up creating Tokens for
> > nothing (by TokenStream.next()).
> >
> > May currently be an empty case (if all current uses call
> > next(Token)), but
> > still - is it safer for SinkTokenizer to implement next() rather than
> > next(Token)?
>
> I'm still a bit fuzzy on the interplay of these myself, but what makes
> the call of SinkTokenizer.next(Token) unsafe or is it just the
> potential of Tokens being created?  I guess SinkTokenizer could just
> override both methods.
>
> -Grant
>
