lucene-dev mailing list archives

From Robert Muir <rcm...@gmail.com>
Subject Re: indexing_slowdown_with_latest_lucene_udpate
Date Mon, 10 Aug 2009 14:43:19 GMT
This is real, and not just for very short docs. I think the reflection
overhead is pretty expensive.
Here are some stats from the Hamshahri corpus (I have been TREC-testing
Persian just to make sure everything is OK):

SimpleAnalyzer (has reusableTokenStream):
Total time: 47816 ms
Unique tokens: 441660

PersianAnalyzer (no reuse):
Total time: 53928 ms
Unique tokens: 438286

PersianAnalyzer (with reusableTokenStream):
Total time: 47704 ms
Unique tokens: 438286
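
For the two PersianAnalyzer runs above, the no-reuse run took about 13% longer (53928 ms vs. 47704 ms) while producing the same 438286 unique tokens. Below is a minimal sketch of the general reuse pattern that could explain a gap like that when per-document construction is costly. It does not use the Lucene API: WhitespaceSplitter, reusable() and countTokens() are hypothetical names made up for illustration. The idea is only that each thread keeps one instance and points it at new input with reset() instead of building a fresh one per document.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class ReuseSketch {

    /** Hypothetical stand-in for a tokenizer; this is NOT a Lucene class. */
    static final class WhitespaceSplitter {
        /** Counts constructions so the effect of reuse is visible (not thread-safe; demo only). */
        static int instancesCreated = 0;

        private Reader input;

        WhitespaceSplitter(Reader input) {
            this.input = input;
            instancesCreated++;
        }

        /** Points the existing instance at new input instead of allocating a new one. */
        void reset(Reader input) {
            this.input = input;
        }

        /** Returns the next whitespace-delimited token, or null at end of input. */
        String next() throws IOException {
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = input.read()) != -1) {
                if (Character.isWhitespace(c)) {
                    if (sb.length() > 0) {
                        break; // end of the current token
                    }
                    // otherwise skip leading whitespace
                } else {
                    sb.append((char) c);
                }
            }
            return sb.length() > 0 ? sb.toString() : null;
        }
    }

    /** One cached splitter per thread, so reuse needs no locking. */
    private static final ThreadLocal<WhitespaceSplitter> CACHED = new ThreadLocal<>();

    /** Returns this thread's cached splitter, creating it only on first use. */
    static WhitespaceSplitter reusable(Reader reader) {
        WhitespaceSplitter s = CACHED.get();
        if (s == null) {
            s = new WhitespaceSplitter(reader);
            CACHED.set(s);
        } else {
            s.reset(reader);
        }
        return s;
    }

    /** Drains the splitter and returns how many tokens it produced. */
    static int countTokens(WhitespaceSplitter s) throws IOException {
        int n = 0;
        while (s.next() != null) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) throws IOException {
        String[] docs = {"a b c", "d e", "f"};
        int total = 0;
        for (String doc : docs) {
            total += countTokens(reusable(new StringReader(doc)));
        }
        // prints "6 tokens, 1 splitter(s) created"
        System.out.println(total + " tokens, "
                + WhitespaceSplitter.instancesCreated + " splitter(s) created");
    }
}
```

Calling new WhitespaceSplitter(...) for every document instead of reusable(...) would create three instances here rather than one; any fixed setup cost in the constructor is then paid once per document.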

On Mon, Aug 10, 2009 at 10:35 AM, Mark Miller<markrmiller@gmail.com> wrote:
> Discussion on speed of new TokenStream API in Solr.
>
> see:
> http://search.lucidimagination.com/search/document/d0040ebe6addad4b/indexing_slowdown_with_latest_lucene_udpate
>
> --
> - Mark
>
> http://www.lucidimagination.com
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>



-- 
Robert Muir
rcmuir@gmail.com
