lucene-dev mailing list archives

From: Yonik Seeley
Subject: Re: who clears attributes?
Date: Tue, 11 Aug 2009 11:09:05 GMT
On Tue, Aug 11, 2009 at 6:50 AM, Robert Muir wrote:
> On Tue, Aug 11, 2009 at 4:28 AM, Michael Busch wrote:
>> There was a performance test in Solr that apparently ran much slower
>> after upgrading to the new Lucene jar. This test is testing a rather
>> uncommon scenario: very very short documents.
> Actually, it's more uncommon than that: it's very very short documents,
> without implementing reusableTokenStream()
> this makes it basically a benchmark of ctor cost... doesn't really
> benchmark the token api in my opinion.

You would be surprised... there are quite a few Solr users that have
relatively short documents... or even if they are sizeable documents,
they have up to hundreds of short metadata-type fields (generally a
token or two).

Reusing TokenStreams has become a must in Solr IMO since construction
costs (hashmap lookups, etc) and GC costs (larger objects) have been
growing.  I'm focused on that now...
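
To make the cost difference concrete, here is a minimal sketch of the
reuse pattern. It is not Lucene or Solr code: WhitespaceStream and
SketchAnalyzer are hypothetical stand-ins, and the per-thread cache is
only an assumption about how one might implement the
reusableTokenStream() idea mentioned above. The point is that the
second call resets an existing object instead of constructing a new one.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class ReusableStreamSketch {

    /** Hypothetical stand-in for a tokenizer; not Lucene's Tokenizer. */
    static final class WhitespaceStream {
        // Term buffer is allocated once and reused for every token.
        private final StringBuilder term = new StringBuilder();
        private Reader input;

        WhitespaceStream(Reader input) {
            this.input = input;
        }

        /** Point this instance at new input instead of allocating a new one. */
        void reset(Reader newInput) {
            this.input = newInput;
            term.setLength(0);
        }

        /** Returns the next whitespace-delimited token, or null at end of input. */
        String next() throws IOException {
            term.setLength(0);
            int c;
            while ((c = input.read()) != -1) {
                if (Character.isWhitespace(c)) {
                    if (term.length() > 0) {
                        break; // token complete
                    }
                    // otherwise skip leading whitespace
                } else {
                    term.append((char) c);
                }
            }
            return term.length() > 0 ? term.toString() : null;
        }
    }

    /** Hypothetical analyzer that keeps one stream per thread. */
    static final class SketchAnalyzer {
        private final ThreadLocal<WhitespaceStream> cached = new ThreadLocal<>();

        /** Always constructs: the expensive path when fields are tiny. */
        WhitespaceStream tokenStream(Reader reader) {
            return new WhitespaceStream(reader);
        }

        /** Constructs once per thread, then only resets. */
        WhitespaceStream reusableTokenStream(Reader reader) {
            WhitespaceStream ts = cached.get();
            if (ts == null) {
                ts = new WhitespaceStream(reader);
                cached.set(ts);
            } else {
                ts.reset(reader);
            }
            return ts;
        }
    }

    /** Drains a stream into a list; wraps IOException for convenience. */
    static List<String> tokens(WhitespaceStream ts) {
        List<String> out = new ArrayList<>();
        try {
            for (String t = ts.next(); t != null; t = ts.next()) {
                out.add(t);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        SketchAnalyzer analyzer = new SketchAnalyzer();
        // Two short metadata-style fields, analyzed on the same thread.
        WhitespaceStream first = analyzer.reusableTokenStream(new StringReader("red shoes"));
        System.out.println(tokens(first));
        WhitespaceStream second = analyzer.reusableTokenStream(new StringReader("size 9"));
        System.out.println(tokens(second));
        System.out.println("same instance reused: " + (first == second));
    }
}
```

With hundreds of one- or two-token fields per document, the
tokenStream() path pays construction and GC cost per field, while the
reusable path pays it once per thread.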

Robert's taking a crack at fixing things up so users can actually
create reusable analyzers out of our filters:

