lucene-dev mailing list archives

From Earwin Burrfoot
Subject Re: who clears attributes?
Date Tue, 11 Aug 2009 11:21:39 GMT
On Tue, Aug 11, 2009 at 15:09, Yonik Seeley wrote:
> On Tue, Aug 11, 2009 at 6:50 AM, Robert Muir wrote:
>> On Tue, Aug 11, 2009 at 4:28 AM, Michael Busch wrote:
>>> There was a performance test in Solr that apparently ran much slower
>>> after upgrading to the new Lucene jar. This test is testing a rather
>>> uncommon scenario: very very short documents.
>> Actually, it's more uncommon than that: it's very, very short documents,
>> without implementing reusableTokenStream().
>> This makes it basically a benchmark of ctor cost... it doesn't really
>> benchmark the token API, in my opinion.
> You would be surprised... there are quite a few Solr users that have
> relatively short documents... or even if they are sizeable documents,
> they have up to hundreds of short metadata-type fields (generally a
> token or two).
> Reusing TokenStreams has become a must in Solr IMO since construction
> costs (hashmap lookups, etc) and GC costs (larger objects) have been
> growing.  I'm focused on that now...
> Robert's taking a crack at fixing things up so users can actually
> create reusable analyzers out of our filters:

+1. We don't use Solr, but we have quite a lot of short and
medium-sized documents, plus heaps of metadata fields.
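
To make the reuse point concrete, here is a minimal sketch of the idea
behind reusableTokenStream(): keep one tokenizer per thread and re-point
it at new input instead of constructing a fresh one per field. The
classes below (SimpleTokenizer, SketchAnalyzer) are hypothetical
stand-ins written for illustration; they are not Lucene's TokenStream or
Analyzer, and the counter exists only to show how many constructions
each path performs.

```java
import java.util.ArrayList;
import java.util.List;

public class ReuseSketch {

    /** Hypothetical whitespace tokenizer; a stand-in, not Lucene's TokenStream. */
    public static final class SimpleTokenizer {
        private String text;
        private int pos;

        SimpleTokenizer(String text) {
            reset(text);
        }

        /** Points this instance at new input so it can be reused. */
        void reset(String text) {
            this.text = text;
            this.pos = 0;
        }

        /** Returns the next token, or null once the input is exhausted. */
        public String next() {
            int n = text.length();
            while (pos < n && Character.isWhitespace(text.charAt(pos))) pos++;
            if (pos >= n) return null;
            int start = pos;
            while (pos < n && !Character.isWhitespace(text.charAt(pos))) pos++;
            return text.substring(start, pos);
        }
    }

    /** Hypothetical analyzer offering both a fresh and a reusing path. */
    public static final class SketchAnalyzer {
        private final ThreadLocal<SimpleTokenizer> cached = new ThreadLocal<SimpleTokenizer>();
        private int constructed = 0; // demo counter; not thread-safe

        /** Always allocates: the cost that dominates for very short fields. */
        public SimpleTokenizer tokenStream(String text) {
            constructed++;
            return new SimpleTokenizer(text);
        }

        /** Allocates once per thread, then only resets the cached instance. */
        public SimpleTokenizer reusableTokenStream(String text) {
            SimpleTokenizer t = cached.get();
            if (t == null) {
                constructed++;
                t = new SimpleTokenizer(text);
                cached.set(t);
            } else {
                t.reset(text);
            }
            return t;
        }

        public int constructedCount() {
            return constructed;
        }
    }

    /** Collects all remaining tokens from the tokenizer into a list. */
    public static List<String> drain(SimpleTokenizer t) {
        List<String> out = new ArrayList<String>();
        for (String tok = t.next(); tok != null; tok = t.next()) {
            out.add(tok);
        }
        return out;
    }

    public static void main(String[] args) {
        String[] fields = {"short title", "tag", "two tokens"};

        SketchAnalyzer fresh = new SketchAnalyzer();
        for (String f : fields) drain(fresh.tokenStream(f));

        SketchAnalyzer reusing = new SketchAnalyzer();
        for (String f : fields) drain(reusing.reusableTokenStream(f));

        System.out.println("fresh constructions:   " + fresh.constructedCount());
        System.out.println("reusing constructions: " + reusing.constructedCount());
    }
}
```

With many one- or two-token metadata fields per document, the first path
pays the construction cost once per field, while the second pays it once
per thread. How large that difference is in real Lucene code depends on
the actual filter chain, which this sketch does not model.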

I have yet to read Uwe's example, but I feel some of you have
misunderstood me. My gripe with the new API is not that it causes us
trouble (that gets solved one way or another); it is that the switch and
the associated migration costs bring zero benefits, in either the
immediate or the remote future.
The only person who has tried to disprove this claim is Uwe. Others
either say "the problems are solved, so it's okay to move to the new
API", or "this will be usable when flexindexing arrives". Sorry, but the
latter argument doesn't hold up: this API is orthogonal to
flexindexing, or at least nobody has shown otherwise.
So, what I'm arguing against is adding code (and forcing users to
migrate) just because we can, with no other reason.

Kirill Zakharenko/Кирилл Захаренко
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785
