lucene-java-user mailing list archives

From Doug Cutting <>
Subject Re: Performance of hit highlighting and finding term positions for
Date Wed, 31 Mar 2004 21:54:23 GMT

<> wrote:
> As a note of warning: I did find StandardTokenizer to be the major culprit
> in my tokenizing benchmarks (avg 75ms for 16k sized docs).
> I have found I can live without StandardTokenizer in my apps.

FYI, the message with Mark's timings can be found at:

According to these, if your documents average 16k, then a 10-hit result 
page would require just 66ms to generate highlights using SimpleAnalyzer.
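
For anyone who wants to redo the arithmetic with their own numbers, here is a rough back-of-envelope sketch. It assumes highlighting cost is dominated by re-tokenizing each hit and scales linearly with the number of hits per page. The class and method names are made up for illustration and are not part of Lucene. The 6.6ms per-document figure is derived from the 66ms page total above, not quoted from the timings, and the 75ms figure is the StandardTokenizer average quoted above.

```java
public class HighlightCostEstimate {

    /**
     * Estimated milliseconds to re-tokenize every hit on one result page,
     * assuming cost grows linearly with the number of hits (an assumption).
     */
    static double pageCostMs(double perDocMs, int hitsPerPage) {
        return perDocMs * hitsPerPage;
    }

    public static void main(String[] args) {
        int hitsPerPage = 10;

        // Derived, not quoted: 66ms per page / 10 hits for 16k documents.
        double simplePerDocMs = 66.0 / hitsPerPage;

        // Quoted average for StandardTokenizer on 16k documents.
        double standardPerDocMs = 75.0;

        System.out.printf("SimpleAnalyzer:    %.0f ms per page%n",
                pageCostMs(simplePerDocMs, hitsPerPage));
        System.out.printf("StandardTokenizer: %.0f ms per page%n",
                pageCostMs(standardPerDocMs, hitsPerPage));
    }
}
```

The two per-document figures come from different measurements, so treat the comparison as an order-of-magnitude estimate only.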
