lucene-java-user mailing list archives

From "John Patterson" <>
Subject Re: Caching of TermDocs
Date Tue, 27 Jul 2004 15:05:30 GMT
Cool.  I'll give it a try.  Looks like extending FilterIndexReader is the
way to go.  Or possibly I could cache the compressed form at a lower level
getting the best of both worlds.  I'll look into both ways, profile the app,
and post my results.
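
A minimal sketch of the first option, caching the decoded form, assuming a
hypothetical Loader that stands in for whatever reads and decodes a term's
postings (for example a FilterIndexReader subclass).  PostingsCache, Postings
and Loader are names made up for illustration, not Lucene classes.  The cache
is bounded by an approximate byte budget and evicts the least recently used
terms first.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical LRU cache of decoded postings, bounded by an approximate byte budget. */
class PostingsCache {
    /** Decoded postings for one term: parallel arrays of document numbers and frequencies. */
    static final class Postings {
        final int[] docs;
        final int[] freqs;
        Postings(int[] docs, int[] freqs) { this.docs = docs; this.freqs = freqs; }
        // Two 4-byte ints per posting; object and array headers are ignored.
        long bytes() { return 8L * docs.length; }
    }

    /** Stand-in (assumption) for the code that reads and decodes a term from the index. */
    interface Loader { Postings load(String term); }

    private final long maxBytes;
    private final Loader loader;
    private long usedBytes = 0;
    // accessOrder=true: iteration runs from least to most recently used.
    private final LinkedHashMap<String, Postings> map = new LinkedHashMap<>(16, 0.75f, true);

    PostingsCache(long maxBytes, Loader loader) {
        this.maxBytes = maxBytes;
        this.loader = loader;
    }

    synchronized Postings get(String term) {
        Postings p = map.get(term);          // a hit also marks the term as recently used
        if (p != null) return p;
        p = loader.load(term);               // miss: fall back to the disk index
        if (p.bytes() > maxBytes) return p;  // too large to cache at all
        map.put(term, p);
        usedBytes += p.bytes();
        // Evict least recently used terms until the budget is respected again.
        Iterator<Map.Entry<String, Postings>> it = map.entrySet().iterator();
        while (usedBytes > maxBytes && it.hasNext()) {
            usedBytes -= it.next().getValue().bytes();
            it.remove();
        }
        return p;
    }

    /** Reports whether a term is cached, without changing its recency. */
    synchronized boolean isCached(String term) { return map.containsKey(term); }
}
```

Whether this pays off still depends on the query traffic, so hit and miss
counts would need to be measured on real queries before drawing conclusions.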

----- Original Message ----- 
From: "Doug Cutting" <>
To: "Lucene Users List" <>
Sent: Tuesday, July 27, 2004 8:33 PM
Subject: Re: Caching of TermDocs

> John Patterson wrote:
> > I would like to hold a significant amount of the index in memory but use
> > disk index as a spill over.  Obviously the best situation is to hold in
> > memory only the information that is likely to be used again soon.  It
> > seems that caching TermDocs would allow popular search terms to be searched
> > efficiently while the less common terms would need to be read from disk.
> The operating system already caches recent disk i/o.  So what you'd save
> primarily would be the overhead of parsing the data.  However the parsed
> form, a sequence of docNo and freq ints, is nearly eight times as large
> as its compressed size in the index.  So your cache would consume a lot
> of memory.
> Whether this would provide much overall speedup depends on the distribution
> of common terms in your query traffic.  If you have a few terms that are
> searched very frequently then it might pay off.  In my experience with
> general-purpose search engines this is not usually the case: folks seem
> to use rarer words in queries than they do in ordinary text.  But in
> some search applications perhaps the traffic is more skewed.  Only some
> experiments would tell for sure.
> Doug
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:
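
To get a feel for the size difference mentioned in the quoted reply, here is a
rough sketch that compares a decoded postings list (two 4-byte ints per
posting) with a variable-byte gap encoding.  The encoding is a simplified
approximation written for illustration, and PostingsSize is a made-up name;
treat the exact byte layout as an assumption rather than the index format
itself.  The ratio obtained depends heavily on the data.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical helper comparing decoded and compressed postings sizes. */
class PostingsSize {
    /** Writes a non-negative int using 7 data bits per byte; a set high bit means more bytes follow. */
    static void writeVInt(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    /**
     * Encodes ascending document numbers as gaps from the previous document.
     * The gap is shifted left by one; a set low bit means freq == 1, otherwise
     * the frequency follows as its own variable-byte int.
     * Assumes gaps are below 2^30 so the shift cannot overflow.
     */
    static byte[] encode(int[] docs, int[] freqs) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int prev = 0;
        for (int i = 0; i < docs.length; i++) {
            int gap = docs[i] - prev;
            prev = docs[i];
            if (freqs[i] == 1) {
                writeVInt(out, (gap << 1) | 1);
            } else {
                writeVInt(out, gap << 1);
                writeVInt(out, freqs[i]);
            }
        }
        return out.toByteArray();
    }

    /** Size of the decoded form: one docNo int and one freq int per posting. */
    static long decodedBytes(int[] docs) {
        return 8L * docs.length;
    }
}
```

Calling encode(docs, freqs).length and decodedBytes(docs) on postings sampled
from a real index gives the two numbers to compare when sizing such a cache.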
