jakarta-jcs-dev mailing list archives

From Aaron Smuts <asm...@yahoo.com>
Subject Re: High Data Volume Issues
Date Tue, 18 Jul 2006 17:08:06 GMT
100GB is a tremendous amount of data to cache.  I'm
caching millions of items using the MySQL disk cache,
but the items are mostly under 10k and I don't
typically go over 2 GB.

How many items do you expect?  For regions with tens
of thousands of items, I do not use the indexed disk
cache.  It keeps the keys and the file offsets in
memory, so it is not suitable for a large number of
items, though it handles a smaller number of large
items well.  For the indexed disk cache's memory
usage, the number of items matters more than their
size.
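Roughly, the in-memory index behaves like the sketch below (a hypothetical illustration, not the actual JCS source; class and field names are made up). Each cached item costs one map entry on the heap, whether the item on disk is 1 KB or 1 MB:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: the indexed disk cache keeps one entry like this
// in memory per cached item, so heap usage grows with the number of
// keys, not with the size of the items on disk.
public class IndexSketch {
    static class IndexEntry {
        final long offset; // position of the serialized item in the data file
        final int length;  // length of the serialized item in bytes
        IndexEntry(long offset, int length) {
            this.offset = offset;
            this.length = length;
        }
    }

    public static void main(String[] args) {
        Map<String, IndexEntry> keyIndex = new HashMap<>();
        // 100,000 keys means 100,000 in-memory entries, regardless of
        // how large each item is on disk.
        for (int i = 0; i < 100_000; i++) {
            keyIndex.put("key" + i, new IndexEntry((long) i * 16_384, 16_384));
        }
        System.out.println(keyIndex.size());
    }
}
```

This is why a region with millions of small items can exhaust the heap even though the items themselves never leave disk.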

If the memory size is set to 0, items will not be
respooled to disk.  We could do a bit more to ensure
that an item is not already present on disk before
spooling it.
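For reference, a region along these lines could be configured in `cache.ccf` roughly as follows (the region name and disk path are illustrative; the property and class names follow the standard JCS configuration format):

```properties
# Region with no memory cache: MaxObjects=0 sends everything straight to disk
jcs.region.bigRegion=DC
jcs.region.bigRegion.cacheattributes=org.apache.jcs.engine.CompositeCacheAttributes
jcs.region.bigRegion.cacheattributes.MaxObjects=0

# Indexed disk cache as the auxiliary
jcs.auxiliary.DC=org.apache.jcs.auxiliary.disk.indexed.IndexedDiskCacheFactory
jcs.auxiliary.DC.attributes=org.apache.jcs.auxiliary.disk.indexed.IndexedDiskCacheAttributes
jcs.auxiliary.DC.attributes.DiskPath=/tmp/jcs
```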

The optimization routine is fairly crude.  Because of
this, I use the MySQL disk cache for regions where I
will be deleting a lot.  However, I would like to
improve the optimization for the indexed disk cache.
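The copy-style compaction at issue can be sketched as follows (a hypothetical illustration of the general technique, not the JCS implementation): live records are copied into a fresh file, which then replaces the old one. While the copy runs, both files exist, which is why peak disk usage is roughly double the live data size.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of copy-style defragmentation: only live records
// are copied into a compact file, leaving dead space behind.
public class OptimizeSketch {
    public static void main(String[] args) throws IOException {
        File data = File.createTempFile("cache", ".data");
        File compact = File.createTempFile("cache", ".tmp");

        // {offset, length} of the records still considered live
        Map<String, long[]> index = new LinkedHashMap<>();
        try (RandomAccessFile raf = new RandomAccessFile(data, "rw")) {
            byte[] live = "live-record".getBytes(StandardCharsets.UTF_8);
            byte[] dead = "dead-record".getBytes(StandardCharsets.UTF_8);
            index.put("a", new long[] { raf.getFilePointer(), live.length });
            raf.write(live);
            raf.write(dead); // removed item: dead space left in the file
            index.put("b", new long[] { raf.getFilePointer(), live.length });
            raf.write(live);
        }

        // Copy only live records into the compact file, rebuilding offsets.
        try (RandomAccessFile in = new RandomAccessFile(data, "r");
             RandomAccessFile out = new RandomAccessFile(compact, "rw")) {
            for (Map.Entry<String, long[]> e : index.entrySet()) {
                byte[] buf = new byte[(int) e.getValue()[1]];
                in.seek(e.getValue()[0]);
                in.readFully(buf);
                e.getValue()[0] = out.getFilePointer();
                out.write(buf);
            }
        }
        System.out.println(data.length() + " -> " + compact.length());
        data.delete();
        compact.delete();
    }
}
```

An in-place compaction (sliding live records toward the front of the same file) would avoid the doubled disk footprint, at the cost of a more delicate index update.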


--- "Schwarz, Peter" <peter.schwarz@objectfx.com>

> We are evaluating your caching library for a high
> read/really large cache
> situation (on the order of 100GB).  We've come
> across several issues that
> we'd like to discuss further with you before we
> proceed with patching the
> source on our end.  
> The first issue is pretty small (from a code change
> perspective) but affects
> the number of writes to the disk.  After loading the
> objects to the cache (2
> million 16K objects in our tests), we proceed to
> read them back (mostly for
> performance metrics - read/write times, small cache
> vs large cache, etc).
> What we found was that for simple reads (no
> modification of the objects) the
> disk time was doubled due to writes of the elements
> falling off memory
> cache.  These elements are being spooled back to
> disk.  There's no need for
> these to go back to disk, since they are unmodified
> and already on disk in
> the first place. 
> The second issue is with the file optimization.  The
> method that you are
> using doubles the disk space needed for the cache.
> Given that we're talking
> about a 100 gig cache file, this is pretty
> expensive.  Secondly, it seems to
> be pretty memory intensive as well.  Our tests seem
> to run out of memory
> during the optimization process with a 512 MB max
> heap size
> (admittedly this seems a bit small given our data
> requirements but the
> writing and reading process is fine).  It seems like
> this could be done a
> little more efficiently.
> I'm planning on opening bugs in Jira on these
> issues as well, just to get them
> in your system.  I would like to start a dialog
> first, given that our usage
> pattern is for very high volume data caches.  
> Thanks, 
> Peter Schwarz
> ObjectFX, Inc
> To unsubscribe, e-mail:
> jcs-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail:
> jcs-dev-help@jakarta.apache.org

