lucene-java-user mailing list archives

From "John Patterson" <dev_johnpatter...@hotmail.com>
Subject Re: Caching of TermDocs
Date Tue, 27 Jul 2004 15:05:30 GMT
Cool.  I'll give it a try.  Looks like extending FilterIndexReader is the
way to go.  Or possibly I could cache the compressed form at a lower level,
getting the best of both worlds.  I'll look into both approaches, profile
the app, and post my results.
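
A minimal sketch of the caching idea, for illustration only: an LRU cache of
decoded postings that is bounded by approximate memory use instead of entry
count.  It deliberately does not touch the Lucene API; DecodedPostings and
PostingsCache are hypothetical names, and how the cache would be wired into a
FilterIndexReader subclass is left open.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical holder for one term's decoded postings: parallel docNo/freq arrays. */
final class DecodedPostings {
    final int[] docs;
    final int[] freqs;

    DecodedPostings(int[] docs, int[] freqs) {
        if (docs.length != freqs.length) {
            throw new IllegalArgumentException("docs and freqs must have equal length");
        }
        this.docs = docs;
        this.freqs = freqs;
    }

    /** Rough heap cost: two 4-byte ints per posting, ignoring array and object headers. */
    long approxBytes() {
        return 8L * docs.length;
    }
}

/** Hypothetical LRU cache keyed by term text, bounded by an approximate byte budget. */
final class PostingsCache {
    private final long maxBytes;
    private long usedBytes = 0;
    // accessOrder = true: iteration runs from least to most recently used entry.
    private final LinkedHashMap<String, DecodedPostings> map =
            new LinkedHashMap<>(16, 0.75f, true);

    PostingsCache(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    /** Returns the cached postings, or null on a miss; a hit marks the entry as recently used. */
    synchronized DecodedPostings get(String term) {
        return map.get(term);
    }

    synchronized void put(String term, DecodedPostings postings) {
        DecodedPostings old = map.put(term, postings);
        if (old != null) {
            usedBytes -= old.approxBytes();
        }
        usedBytes += postings.approxBytes();
        // Evict least recently used entries until the budget is respected.
        // An entry larger than the whole budget ends up evicting itself.
        Iterator<Map.Entry<String, DecodedPostings>> it = map.entrySet().iterator();
        while (usedBytes > maxBytes && it.hasNext()) {
            usedBytes -= it.next().getValue().approxBytes();
            it.remove();
        }
    }

    synchronized long usedBytes() {
        return usedBytes;
    }

    synchronized int size() {
        return map.size();
    }
}
```

Bounding by bytes instead of entry count matters here because posting lists
for common terms can be orders of magnitude longer than those for rare terms.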

----- Original Message ----- 
From: "Doug Cutting" <cutting@apache.org>
To: "Lucene Users List" <lucene-user@jakarta.apache.org>
Sent: Tuesday, July 27, 2004 8:33 PM
Subject: Re: Caching of TermDocs


> John Patterson wrote:
> > I would like to hold a significant amount of the index in memory but use
> > the disk index as a spill over.  Obviously the best situation is to hold
> > in memory only the information that is likely to be used again soon.  It
> > seems that caching TermDocs would allow popular search terms to be
> > searched more efficiently while the less common terms would need to be
> > read from disk.
>
> The operating system already caches recent disk i/o.  So what you'd save
> primarily would be the overhead of parsing the data.  However the parsed
> form, a sequence of docNo and freq ints, is nearly eight times as large
> as its compressed size in the index.  So your cache would consume a lot
> of memory.
>
> Whether this provides much overall speedup depends on the distribution
> of common terms in your query traffic.  If you have a few terms that are
> searched very frequently then it might pay off.  In my experience with
> general-purpose search engines this is not usually the case: folks seem
> to use rarer words in queries than they do in ordinary text.  But in
> some search applications perhaps the traffic is more skewed.  Only some
> experiments would tell for sure.
>
> Doug
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>
>
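
To make the size comparison in the quoted reply concrete, here is a rough
sketch that encodes a posting list with a simplified variable-byte gap
encoding and compares the result with the decoded size of two 4-byte ints
per posting.  The scheme is an assumption chosen for illustration and is not
claimed to match the index's on-disk format byte for byte; the class and
method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical helper comparing encoded and decoded posting-list sizes. */
final class PostingsSizeEstimate {

    /** Writes v using 7 data bits per byte; a set high bit means more bytes follow. */
    static void writeVarInt(ByteArrayOutputStream out, int v) {
        while ((v & ~0x7F) != 0) {
            out.write((v & 0x7F) | 0x80);
            v >>>= 7;
        }
        out.write(v);
    }

    /**
     * Encodes ascending document numbers as gaps. The gap is shifted left by
     * one bit; a set low bit means "freq is 1" and no separate freq is written.
     * Returns the number of encoded bytes.
     */
    static int encodedBytes(int[] docs, int[] freqs) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int prev = 0;
        for (int i = 0; i < docs.length; i++) {
            int gap = docs[i] - prev;
            prev = docs[i];
            if (freqs[i] == 1) {
                writeVarInt(out, (gap << 1) | 1);
            } else {
                writeVarInt(out, gap << 1);
                writeVarInt(out, freqs[i]);
            }
        }
        return out.size();
    }

    /** Decoded cost: one int for docNo and one int for freq per posting. */
    static int decodedBytes(int[] docs) {
        return 8 * docs.length;
    }
}
```

Under this toy scheme, a posting with a gap below 64 and a frequency of 1
takes a single byte, against 8 bytes once decoded.  That is the best case;
larger gaps or frequencies above 1 narrow the ratio.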
