lucene-dev mailing list archives

From Michael Busch <busch...@gmail.com>
Subject Re: TokenStream and Token APIs
Date Sun, 19 Oct 2008 23:09:33 GMT
Mark Miller wrote:
> Grant Ingersoll wrote:
>>
>> On Oct 19, 2008, at 12:56 AM, Mark Miller wrote:
>>
>>> Grant Ingersoll wrote:
>>>>
>>>> Bear with me, b/c I'm not sure I'm following, but looking at 
>>>> https://issues.apache.org/jira/browse/LUCENE-1422, I see at least 5 
>>>> different implemented Attributes.
>>>>
>>>> So, let's say I add 5 more attributes and now have a total of 10 
>>>> attributes. Are you saying that I then would have, potentially, 10 
>>>> different variables that all point to the token as in the code 
>>>> snippet above where the casting takes place? Or would I just create 
>>>> a single "Super" attribute that folds in all of my new attributes, 
>>>> plus any other existing ones? Or, maybe, what I would do is create 
>>>> the 5 new attributes and then 1 new attribute that extends all 10, 
>>>> thus allowing me to use them individually, but saving me from 
>>>> having to do a whole ton of casting in my Consumer.
>>> Potentially one consumer doing 10 things, but not likely right? I 
>>> mean, things will stay logical as they are now, and rather than a 
>>> super consumer doing everything, we will still have a chain of 
>>> consumers each doing its own piece. So more likely, maybe something 
>>> comes along every so often (another 5, over *much* time, say) and 
>>> each time we add a Consumer that uses one or two TokenStream types. 
>>> And then it's just an implementation detail on whether you make a 
>>> composite TokenStream - if you have added 10 new attributes and see 
>>> fit to make one consumer use them all, sure, make a composite, 
>>> super type, but in my mind, the way it's done in the example code is 
>>> clearer/cleaner for a handful of TokenStream types. And even if you 
>>> do make the composite, super type, it's likely to just be a sugar 
>>> wrapper anyway - the implementation for say, payload and positions, 
>>> should probably be maintained in their own classes anyway.
>>
>> Well, there are 5 different attributes already, all of which are 
>> commonly used.  Seems weird to have to cast the same var 5 different 
>> ways.  Definitely agree that one would likely deal with this by 
>> wrapping, but then you end up either needing to extend your wrapper 
>> or add new wrappers...
> Okay, I see, all of that is going to happen in one Consumer; you're not 
> going to want to read the TokenStream more than once. I see your point 
> now.
>
Hmm, I'm not sure I understand why you would have to read the 
TokenStream more than once?
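
For illustration, a minimal sketch of the single-pass case, assuming
a stream that reuses one token object implementing several attribute
interfaces. TermAttr, PositionAttr, PayloadAttr, Token and Stream are
hypothetical stand-ins made up for this example; they are not the
classes from the LUCENE-1422 patch. The consumer casts the same
reference once per attribute, up front, and then walks the stream
exactly once:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class AttributeSketch {
    // Hypothetical attribute interfaces (stand-ins, not the LUCENE-1422 API).
    interface TermAttr { String term(); }
    interface PositionAttr { int positionIncrement(); }
    interface PayloadAttr { byte[] payload(); }

    // One mutable token object that implements every attribute.
    static final class Token implements TermAttr, PositionAttr, PayloadAttr {
        String term;
        int posInc = 1;
        byte[] payload = new byte[0];
        public String term() { return term; }
        public int positionIncrement() { return posInc; }
        public byte[] payload() { return payload; }
    }

    // Hypothetical stream: reuses a single Token and updates it on next().
    static final class Stream {
        private final Iterator<String> words;
        private final Token token = new Token();
        Stream(List<String> words) { this.words = words.iterator(); }
        Object token() { return token; }
        boolean next() {
            if (!words.hasNext()) return false;
            token.term = words.next();
            return true;
        }
    }

    // The consumer casts once per attribute, then reads the stream once.
    static List<String> consume(Stream stream) {
        Object t = stream.token();
        TermAttr term = (TermAttr) t;
        PositionAttr pos = (PositionAttr) t;
        PayloadAttr payload = (PayloadAttr) t;
        List<String> out = new ArrayList<>();
        while (stream.next()) {
            // All three views reflect the current token on every step.
            out.add(term.term() + ":" + pos.positionIncrement()
                    + ":" + payload.payload().length);
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [quick:1:0, fox:1:0]
        System.out.println(consume(new Stream(Arrays.asList("quick", "fox"))));
    }
}
```

In this sketch the number of casts grows with the number of
attributes, which is Grant's concern, but the number of passes over
the stream stays at one.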

-Michael




---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

