lucene-solr-dev mailing list archives

From Yonik Seeley <yo...@lucidimagination.com>
Subject Re: indexing slowdown with latest lucene update
Date Sun, 09 Aug 2009 18:02:31 GMT
It looks like implementing the new attribute stuff will not be enough
- the token architecture has changed enough that it looks like we must
cache tokenstreams to get back to good performance.
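
To illustrate the caching idea, here is a minimal sketch of per-thread reuse. It uses hypothetical stand-in classes (TokenStreamCacheSketch, ReusableStream, streamFor), not the actual Lucene or Solr types, and it assumes that construction is the expensive part and that a stream can be pointed at new input with a reset:

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative only: every name here is hypothetical, not Lucene/Solr API. */
final class TokenStreamCacheSketch {

    /** Stand-in for a token stream that is costly to construct. */
    static final class ReusableStream {
        /** Counts constructions so the effect of caching is observable. */
        static final AtomicInteger CONSTRUCTED = new AtomicInteger();

        private String text = "";
        private int pos;

        ReusableStream() {
            CONSTRUCTED.incrementAndGet();
        }

        /** Point the existing instance at new input instead of allocating. */
        void reset(String newText) {
            text = newText;
            pos = 0;
        }

        /** Returns the next whitespace-separated token, or null when done. */
        String next() {
            while (pos < text.length() && Character.isWhitespace(text.charAt(pos))) {
                pos++;
            }
            if (pos >= text.length()) {
                return null;
            }
            int start = pos;
            while (pos < text.length() && !Character.isWhitespace(text.charAt(pos))) {
                pos++;
            }
            return text.substring(start, pos);
        }
    }

    /** One cached stream per thread, so indexing threads never share state. */
    private static final ThreadLocal<ReusableStream> CACHE =
            ThreadLocal.withInitial(ReusableStream::new);

    /** Returns this thread's cached stream, reset to read the given text. */
    static ReusableStream streamFor(String text) {
        ReusableStream stream = CACHE.get();
        stream.reset(text);
        return stream;
    }
}
```

With this shape, a thread that indexes many documents constructs one stream and resets it per document. A stream returned by streamFor is only valid until the same thread calls streamFor again.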

-Yonik
http://www.lucidimagination.com


On Sun, Aug 9, 2009 at 12:57 PM, Yonik Seeley<yonik@lucidimagination.com> wrote:
> OK, I've isolated (magnified) the effect with a test I just checked in.
> Indexing documents directly at the UpdateHandler was 85% faster before
> the latest lucene update.
>
> Run the test like this:
>
> ant test -Dtestcase=TestIndexingPerformance -Dargs="-server -Diter=100000"; grep throughput build/test-results/*TestIndexingPerformance.xml
>
> To run on an older trunk version, just copy over
> src/test/org/apache/solr/update/TestIndexingPerformance.java
> src/test/test-files/solr/conf/solrconfig_perf.xml
>
> I had a throughput of 10946 docs/sec before the lucene update, and 5849 after.
>
> -Yonik
> http://www.lucidimagination.com
>
>
> On Sun, Aug 9, 2009 at 12:10 PM, Yonik Seeley<yonik@lucidimagination.com> wrote:
>> On Sun, Aug 9, 2009 at 12:01 PM, Grant Ingersoll<gsingers@apache.org> wrote:
>>> Or bite the bullet and upgrade to the incrementToken() method.
>>
>> Right - I'm not sure if that would fix it or not - I haven't been
>> involved in the new Token attribute stuff...
>> I'm currently writing a basic indexing unit test that we can use to
>> measure this (the standard solrconfig does stuff that slows down
>> indexing a lot, but helps in catching bugs on edge cases by creating
>> many segments).
>>
>> -Yonik
>> http://www.lucidimagination.com
>>
>
