lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Michael Busch <busch...@gmail.com>
Subject Re: New Token API was Re: Payloads and TrieRangeQuery
Date Mon, 15 Jun 2009 21:47:43 GMT
Sounds promising, but I have to think about if there are not 
side-effects of this change other than a slowdown for people who create 
multiple tokens (which would be acceptable as you said, because it's not 
recommended anyway and should be rare).

On 6/15/09 1:46 PM, Uwe Schindler wrote:
>
> Maybe change the deprecation wrapper around next() and next(Token) 
> [the default impl of incrementToken()] to check, if the retrieved 
> token is not identical to the attribute and then just copy the 
> contents to the instance-Token? This would be a slowdown, but only be 
> the case for very rare TokenStreams that did not reuse token before 
> (and were slow before, too).
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
>
> ------------------------------------------------------------------------
>
> *From:* Michael Busch [mailto:buschmic@gmail.com]
> *Sent:* Monday, June 15, 2009 10:39 PM
> *To:* java-dev@lucene.apache.org
> *Subject:* Re: New Token API was Re: Payloads and TrieRangeQuery
>
> I have implemented most of that actually (the interface part and Token 
> implementing all of them).
>
> The problem is a paradigm change with the new API: the assumption is 
> that there is always only one single instance of an Attribute. With 
> the old API, it is recommended to reuse the passed-in token, but you 
> don't have to, you can also return a new one with every call of next().
>
> Now with this change the indexer classes should only know about the 
> interfaces, if shouldn't know Token anymore, which seems fine when 
> Token implements all those interfaces. BUT, since there can be more 
> than once instance of Token, the indexer would have to call 
> getAttribute() for all Attributes it needs after each call of next(). 
> I haven't measured how expensive that is, but it seems like a severe 
> performance hit.
>
> That's basically the main reason why the backwards compatibility is 
> ensured in such a goofy way right now.
>
>  Michael
>
> On 6/15/09 1:28 PM, Uwe Schindler wrote:
>
>> And I don't like the *useNewAPI*() methods either. I spent a lot of time
>> thinking about backwards compatibility for this API. It's tricky to do
>> without sacrificing performance. In API patches I find myself spending
>> more time for backwards-compatibility than for the actual new feature! :(
>>   
>> I'll try to think about how to simplify this confusing old/new API mix.
>>      
>   
> One solution to fix this useNewAPI problem would be to change the
> AttributeSource in a way, to return classes that implement interfaces (as
> you proposed some weeks ago). The good old Token class then do not need to
> be deprecated, it could simply implement all 4 interfaces. AttributeSource
> then must implement a registry, which classes implement which interfaces. So
> if somebody wants a TermAttribute, he always gets the Token. In all other
> cases, the default could be a *Impl default class.
>   
> In this case, next() could simply return this Token (which is the all 4
> attributes). The reuseableToken is simply thrown away in the deprecated API,
> the reuseable Token comes from the AttributeSource and is per-instance.
>   
> Is this an idea?
>   
> Uwe
>   
>   
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:java-dev-unsubscribe@lucene.apache.org  <mailto:java-dev-unsubscribe@lucene.apache.org>
> For additional commands, e-mail:java-dev-help@lucene.apache.org  <mailto:java-dev-help@lucene.apache.org>
>   
>   
>    
>


Mime
View raw message