lucene-java-user mailing list archives

From "Michael Stoppelman" <stop...@gmail.com>
Subject StandardTokenizer is slowing down highlighting a lot
Date Thu, 19 Jul 2007 00:28:44 GMT
Hi all,

I was tracking down slowness in the contrib highlighter code, and
the seemingly simple tokenStream.next() call is the culprit. I've
seen multiple posts mentioning this as a possible cause. Has anyone
looked into speeding up StandardTokenizer? For my documents it's
taking about 70ms per document, which is a big ugh! I was thinking
I might just cache the TermVectors in memory if that would be
faster. Anyone have another approach to this problem?
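
For what it's worth, one way to avoid re-tokenizing at highlight
time is to rebuild the TokenStream from stored term vectors via the
contrib TokenSources class, so StandardTokenizer never runs at query
time. Below is a minimal sketch against the Lucene 2.x contrib API;
the writer/reader/query/analyzer/docId variables are illustrative
placeholders, not anything from the original setup:

    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.search.highlight.Highlighter;
    import org.apache.lucene.search.highlight.QueryScorer;
    import org.apache.lucene.search.highlight.TokenSources;

    // Index time: store positions and offsets so the token stream
    // can be reconstructed without re-analyzing the stored text.
    Document doc = new Document();
    doc.add(new Field("body", text,
                      Field.Store.YES,
                      Field.Index.TOKENIZED,
                      Field.TermVector.WITH_POSITIONS_OFFSETS));
    writer.addDocument(doc);

    // Highlight time: pull the token stream from the term vector
    // (falls back to re-analyzing only if no vector was stored).
    Highlighter highlighter = new Highlighter(new QueryScorer(query));
    TokenStream tokens =
        TokenSources.getAnyTokenStream(reader, docId, "body", analyzer);
    String stored = reader.document(docId).get("body");
    String[] fragments =
        highlighter.getBestFragments(tokens, stored, 3);

The tradeoff is index size: WITH_POSITIONS_OFFSETS makes the index
noticeably larger, but reading the vector back should be much
cheaper than 70ms of re-tokenization per document.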

-M
