lucene-dev mailing list archives

From "Robert Engels" <>
Subject caching term information?
Date Thu, 18 May 2006 17:43:08 GMT
Has anyone thought of (or implemented) caching of term information?

Currently, Lucene stores an index of every nth term, uses that index to
position the TermEnum, and then scans the terms sequentially.

Might it be better to read a "page" of term infos (based on the index), and
then keep these pages in a soft-reference cache in the SegmentTermEnum?
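A minimal sketch of the idea, assuming a page size equal to the index
interval and using plain String terms as a stand-in for the on-disk term
dictionary (the class and method names here are hypothetical, not Lucene
APIs; a real implementation would decode TermInfo entries from the .tis
file starting at the indexed term):

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: cache whole "pages" of term entries keyed by
// index-block number, holding each page through a SoftReference so the
// GC may reclaim pages under memory pressure.
public class TermPageCache {
    private final int indexInterval;      // interval between indexed terms
    private final String[] allTerms;      // stand-in for the on-disk dictionary
    // pageNo -> soft reference to the decoded page of terms
    private final Map<Integer, SoftReference<String[]>> pages = new HashMap<>();

    public TermPageCache(String[] allTerms, int indexInterval) {
        this.allTerms = allTerms;
        this.indexInterval = indexInterval;
    }

    // Decode one page; in Lucene this would re-read and decode the
    // term-dictionary entries starting at the indexed term.
    private String[] loadPage(int pageNo) {
        int start = pageNo * indexInterval;
        int end = Math.min(start + indexInterval, allTerms.length);
        String[] page = new String[end - start];
        System.arraycopy(allTerms, start, page, 0, page.length);
        return page;
    }

    // Seek a term: locate its page via the every-nth-term index, then
    // scan within the cached page instead of re-decoding from disk.
    public int seek(String term) {
        // walk the index of every nth term (Lucene binary-searches here)
        int pageNo = 0;
        while ((pageNo + 1) * indexInterval < allTerms.length
                && allTerms[(pageNo + 1) * indexInterval].compareTo(term) <= 0) {
            pageNo++;
        }
        SoftReference<String[]> ref = pages.get(pageNo);
        String[] page = (ref == null) ? null : ref.get();
        if (page == null) {               // miss, or reclaimed by the GC
            page = loadPage(pageNo);
            pages.put(pageNo, new SoftReference<>(page));
        }
        for (int i = 0; i < page.length; i++) {
            if (page[i].equals(term)) return pageNo * indexInterval + i;
        }
        return -1;                        // term not in the dictionary
    }
}
```

Repeated seeks into the same region then hit the decoded page directly,
which is where the byte/char-level decoding cost would be avoided.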

It seems the byte/char-level processing is what consumes the most CPU when
performing searches. Caching should reduce that dramatically, especially for
common term scans.

I realize there are better ways for some common scans (range filters, etc.)
that would avoid the overhead completely.

Any thoughts?

