db-derby-dev mailing list archives

From "Gokul Soundararajan" <gokul.soun...@gmail.com>
Subject Re: Google SoC: Derby Cache Manager
Date Sat, 10 Jun 2006 17:31:08 GMT
Thanks to all who commented on my early results. I have added the
results of a mixed workload containing both Zipf and Scans. I followed
the example provided in the 2Q paper in which they tried out scans of
different lengths. I used a 10,000-item cache with a 100,000-item
dataset and ran a mixed workload of 0.8 Zipf with 33% scans of
lengths (0, 10, 100, 1000, 10000). The resulting graph is available
as PNG and PDF.
The results show a significant benefit from using the CART algorithm.
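
For concreteness, here is a rough sketch of how such a mixed Zipf-plus-scan
reference stream could be generated. The class and method names are
illustrative only, not the actual prototype code:

```java
import java.util.Random;

// Illustrative mixed-workload generator: references are drawn either from a
// Zipf(skew) distribution over the item ids or from sequential scans of a
// chosen length, mirroring the 2Q paper's experiment setup.
public class MixedWorkload {
    private final int items;     // dataset size, e.g. 100000
    private final double[] cdf;  // precomputed Zipf CDF over item ranks
    private final Random rnd = new Random(42);

    public MixedWorkload(int items, double skew) {
        this.items = items;
        this.cdf = new double[items];
        double norm = 0.0;
        for (int i = 1; i <= items; i++) norm += 1.0 / Math.pow(i, skew);
        double acc = 0.0;
        for (int i = 1; i <= items; i++) {
            acc += 1.0 / Math.pow(i, skew) / norm;
            cdf[i - 1] = acc;
        }
    }

    /** One Zipf-distributed reference (0-based item id). */
    public int nextZipf() {
        double u = rnd.nextDouble();
        int lo = 0, hi = items - 1;
        while (lo < hi) {                 // binary search in the CDF
            int mid = (lo + hi) / 2;
            if (cdf[mid] < u) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    /** A sequential scan of the given length from a random offset. */
    public int[] nextScan(int len) {
        int start = rnd.nextInt(items);
        int[] refs = new int[len];
        for (int i = 0; i < len; i++) refs[i] = (start + i) % items;
        return refs;
    }
}
```

A driver would then interleave the two sources, e.g. drawing a scan with
probability 0.33 and a Zipf reference otherwise.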

Earlier, I was leaning towards using the 2Q algorithm but I found out
that it has a significant synchronization penalty. The Postgres
community implemented the 2Q algorithm (in 8.0) when they found out
that ARC was patented by IBM. Since then, they have moved to Clock (in
8.1), mostly due to the contention penalty of 2Q. Since CART is a
cousin of Clock, it may carry less of that overhead.
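
A minimal sketch (illustrative only, not Derby or CART code) of why
Clock-style policies synchronize less than LRU-based ones like 2Q: a cache
hit merely sets a per-frame reference bit with a single atomic write, whereas
an LRU hit must unlink and relink the entry under a shared list lock.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

// Hypothetical Clock sketch: the "hand" sweeps the frames, giving a second
// chance to any frame whose reference bit is set. Hits touch only one
// atomic bit, so readers never contend on a global LRU list lock.
public class ClockSketch {
    private final int size;
    private final AtomicIntegerArray refBit;  // 1 = recently used
    private int hand = 0;                     // advanced only during eviction

    public ClockSketch(int size) {
        this.size = size;
        this.refBit = new AtomicIntegerArray(size);
    }

    /** On a cache hit: mark the frame. No global lock required. */
    public void touch(int frame) {
        refBit.set(frame, 1);
    }

    /** Find a victim: sweep, clearing bits, until an unreferenced frame. */
    public int evict() {
        while (true) {
            if (refBit.get(hand) == 0) {
                int victim = hand;
                hand = (hand + 1) % size;
                return victim;
            }
            refBit.set(hand, 0);              // second chance
            hand = (hand + 1) % size;
        }
    }
}
```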

See the wiki for the graphs.
Link: http://wiki.apache.org/db-derby/DerbyLruCacheManager



On 6/5/06, Manish Khettry <manish.khettry@gmail.com> wrote:
> From the graphs it does look like we can gain a little performance by
> switching the cache replacement policy.  A few thoughts on the wiki page and
> the work you've done so far.
> Would it be possible to have pointers to papers that describe these
> algorithms on the wiki page? I could find CAR (
> http://www.almaden.ibm.com/cs/people/dmodha/clockfast.pdf)
> and 2Q (http://www.vldb.org/conf/1994/P439.PDF ) but not
> From looking at the graphs, it looks like CART and 2Q are the two choices
> really. However running a few more experiments may not be a bad idea. Knut
> mentions adding scans to the zipf mix and that may be worthwhile. Actually
> the original 2Q paper does mix scans of different lengths with one of the
> zipf workload. Also given that the two seem to converge around 10K cache
> size (or 10% of the dataset size), can we go upto 20 or 40% and see if one
> policy is measurably better than the other?
> If there isn't a whole lot of difference between the performance then it
> makes sense to go with the simpler implementation and/or consider
> synchronization concerns-- I'm guessing that 2Q (based on LRU) will involve
> more synchronization than CART (Clock based).
> Those papers and your results were pretty interesting reading. I too am
> curious about the zig-zag pattern on the zipf 0.8 dataset.
> Thanks,
> Manish
> On 6/4/06, Gokul Soundararajan <gokul.soundar@gmail.com> wrote:
> > Hi all,
> >
> > I'm a Google SoC student working on creating a CacheManager with a
> > better cache replacement algorithm. After talking with my mentor
> > (Oystein Grovlen), I have created a bunch of small prototypes to
> > evaluate different algorithms found in literature.
> >
> > I have run them for a Zipf workload for 0.5 and 0.8 and posted the
> > results on the Derby Wiki.
> > Link: http://wiki.apache.org/db-derby/DerbyLruCacheManager
> > The results show that the 2Q, CAR, and CART algorithms perform better
> > than LRU and Clock.
> >
> > I would like to get some feedback on my results to help me decide what
> > I should implement in Derby.
> >
> > Thanks,
> >
> > Gokul
> >
