lucene-dev mailing list archives

From Michael Busch
Subject Re: TokenStream and Token APIs
Date Tue, 21 Oct 2008 17:45:34 GMT
Grant Ingersoll wrote:
> On Oct 21, 2008, at 1:39 AM, Michael Busch wrote:
>>> Perhaps it would be useful for Lucene to offer exactly one subclass 
>>> of Token that we guarantee will always have all known Attributes 
>>> (i.e. the ones Lucene provides)  available to it for casting purposes.
>> Yeah we could do that. In fact, I did exactly this when I started 
>> working on this patch. I created a class called PlainToken, which had 
>> all the termBuffer and attributes logic, and changed Token to extend 
>> it. Then the new getToken() method would return an instance of 
>> PlainToken. My main concern with this approach is that it will make 
>> the code in the indexer more complicated, because it always has to 
>> check if we have a Token or PlainToken; if it's a Token then it has 
>> to use the get*() methods directly; for a PlainToken it has to check 
>> for the *Attributes. So that's a bit messy (it's in fact exactly like 
>> that in the current patch for backwards-compatibility, but we could 
>> clean this up in 3.0). So for code simplicity I'm slightly in favor 
>> of not creating a class that implements a default set of 
>> functionality without Attributes.
> Yes that would be messy, but not exactly what I was proposing.  I was 
> originally thinking we needed a derived class, but now it seems like 
> we should just keep convenience methods on Token itself.
> That is, why not just have Token implement both the attribute methods 
> and dummy wrappers for the guaranteed to exist Attributes that Lucene 
> implements?
> e.g.
> public int startOffset() {
>     return getAttribute(OffsetAttribute.class).startOffset();
> }
> This makes back-compat a snap; moreover, it causes less pain for 
> people, b/c Analyzer/Token stuff is more than likely one of the 
> most customized pieces of Lucene.
That's a good idea! I will add that to my patch...
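
For illustration only, here is a rough, self-contained sketch of what such convenience wrappers on Token might look like. The Attribute interface, the OffsetAttribute class, and the map-based storage below are simplified stand-ins assumed for this example, not the classes from the patch, and the generic getAttribute() signature is likewise an assumption made to keep the sketch free of casts.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical marker interface standing in for the attribute base type.
interface Attribute {}

// Hypothetical, simplified offset attribute holding start and end offsets.
class OffsetAttribute implements Attribute {
    private int startOffset;
    private int endOffset;

    int startOffset() { return startOffset; }
    int endOffset() { return endOffset; }

    void setOffset(int start, int end) {
        this.startOffset = start;
        this.endOffset = end;
    }
}

// Sketch of a Token that exposes both the attribute lookup and
// old-style convenience methods that delegate to the attributes.
class Token {
    private final Map<Class<? extends Attribute>, Attribute> attributes =
            new HashMap<>();

    Token() {
        // Attributes that are guaranteed to exist are registered up front,
        // so the convenience wrappers below can always resolve them.
        attributes.put(OffsetAttribute.class, new OffsetAttribute());
    }

    // Attribute-style access: look up an attribute by its class.
    <T extends Attribute> T getAttribute(Class<T> type) {
        Attribute attribute = attributes.get(type);
        if (attribute == null) {
            throw new IllegalArgumentException(
                    "No such attribute: " + type.getName());
        }
        return type.cast(attribute);
    }

    // Convenience wrappers: existing callers keep using these methods
    // without knowing that the values now live in an attribute.
    int startOffset() {
        return getAttribute(OffsetAttribute.class).startOffset();
    }

    int endOffset() {
        return getAttribute(OffsetAttribute.class).endOffset();
    }

    void setOffset(int start, int end) {
        getAttribute(OffsetAttribute.class).setOffset(start, end);
    }
}
```

With something along these lines, code that calls token.startOffset() and code that calls token.getAttribute(OffsetAttribute.class).startOffset() would read the same underlying value, so the indexer would not need separate code paths.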

