lucene-dev mailing list archives

Subject Re: Proposal: extracting term-level stats from query process
Date Thu, 11 Mar 2004 23:29:17 GMT
I just re-ran the same tests, this time using SimpleAnalyzer (a lowercase filter only).

This time round the results were:
Tokenizing: 5 ms avg per doc
Highlighting: 11 ms avg per doc
RAM indexing docs: 39 ms avg per doc

RAM indexing still looks to add more than I would like.

Having reviewed my previous choice of analyzer, the main offender in its list of filters looks
to be "StandardTokenizer".
On its own it clocks up an average of 73 ms per doc.

To be honest, at first glance I don't know what it is trying to do - it's JavaCC-generated code
and it's not immediately obvious to me.
I do see it's using Vectors internally, so that's not going to help matters.
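
For anyone wanting to reproduce per-document averages like the ones above, here is a minimal sketch of a timing harness. It is not the test code behind these numbers and it does not use Lucene: PerDocTimer, lowercaseTokens and avgMillisPerDoc are hypothetical names, and the tokenizer is only a rough stand-in (split on non-letters, lowercase) for the step being measured.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

public class PerDocTimer {

    // Hypothetical stand-in for an analysis step: split on non-letters, lowercase.
    static List<String> lowercaseTokens(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // Runs the given step once per document and returns the average
    // wall-clock time per document in milliseconds.
    static double avgMillisPerDoc(List<String> docs, Consumer<String> step) {
        if (docs.isEmpty()) {
            throw new IllegalArgumentException("need at least one document");
        }
        long start = System.nanoTime();
        for (String doc : docs) {
            step.accept(doc);
        }
        long elapsedNanos = System.nanoTime() - start;
        return (elapsedNanos / 1_000_000.0) / docs.size();
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "The quick brown fox", "Jumps over the LAZY dog");
        double avg = avgMillisPerDoc(docs, d -> lowercaseTokens(d));
        System.out.println("Tokenizing: " + avg + " ms avg per doc");
    }
}
```

Swapping a different step into the lambda (highlighting, RAM indexing, a single tokenizer on its own) gives figures that can be compared on the same document set. A single pass like this includes JIT warm-up, so repeated runs would be needed for stable numbers.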

