incubator-lucy-dev mailing list archives

From Nick Wellnhofer <wellnho...@aevum.de>
Subject [lucy-dev] Some quick benchmarks
Date Wed, 07 Dec 2011 21:42:57 GMT
Some quick and completely unscientific benchmarks, indexing the same 
10K ASCII document 1000 times:

RT = RegexTokenizer
ST = StandardTokenizer
CF = CaseFolder
N  = Normalizer

RT:    2.177s
RT+CF: 3.964s
RT+N:  2.556s
ST:    1.551s
ST+CF: 3.357s
ST+N:  1.931s
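
To make the per-stage cost easier to read off, here is a small sketch that subtracts the bare tokenizer time from each combined time. The timings are copied from the list above; the dictionary and the helper name `added_cost` are only illustrative and are not part of Lucy. Since these are single unscientific runs, the differences should be taken as rough indications only.

```python
# Timings in seconds, copied from the benchmark figures above.
timings = {
    "RT": 2.177, "RT+CF": 3.964, "RT+N": 2.556,
    "ST": 1.551, "ST+CF": 3.357, "ST+N": 1.931,
}

def added_cost(tokenizer, stage):
    """Seconds added by combining `stage` with a bare tokenizer (hypothetical helper)."""
    return round(timings[f"{tokenizer}+{stage}"] - timings[tokenizer], 3)

# Extra time for each tokenizer/stage combination.
overheads = {
    (tok, stage): added_cost(tok, stage)
    for tok in ("RT", "ST")
    for stage in ("CF", "N")
}

for (tok, stage), secs in overheads.items():
    print(f"{tok}+{stage}: +{secs:.3f}s over {tok}")
```

On these numbers the CaseFolder adds roughly 1.8s and the Normalizer roughly 0.38s, with either tokenizer.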

It's also interesting that putting the tokenizer ahead of the case 
folder or normalizer always gave me faster results.

Nick
