commons-dev mailing list archives

From: "Aaron Smuts" <>
Subject: RE: [collections][PATCH] SequencedHashMap violates Map contract, has poor performance -- BufferCache, SequencedHastable, MRUMap and LRUMap.
Date: Mon, 04 Feb 2002 03:31:12 GMT

> -----Original Message-----
> From: Michael Smith []
> Sent: Sunday, February 03, 2002 5:17 PM
> To: Jakarta Commons Developers List
> Subject: RE: [collections][PATCH] SequencedHashMap violates Map contract,
> has poor performance -- BufferCache, SequencedHastable, MRUMap and
> LRUMap.
> > > ok.  I've looked over some of the classes you mentioned.  LRUMap in
> > > commons.collections definitely suffers the same problems.  I'm about
> > > to post a patch for that class that enumerates some of the problems
> > > that I saw (I didn't fix them yet -- the patch just changes the
> > > license to the proper long form and documents the problems).
> > >
> >
> > The main problem here is that the second most recently used
> > item can get
> > dropped.  It is not proper LRU.
> yup.  That's one of the things I mentioned in the documentation (in my
> patch).  I even mentioned your example.  :)

I just saw the patch.  Sounds good.

> > > Assuming you mean org.apache.turbine.util.BufferCache when you
> > > referred to BufferCache, I looked it over as well.  It inherits from
> > > org.apache.turbine.util.SequencedHashtable, which looks to be an
> > > exact duplicate of the SequencedHashtable that's in commons.util and
> > > nearly the same as the SequencedHashMap that exists in
> > > commons.collections (which is the one I submitted my patch for).
> > >
> > > I can't find any reference to MRUMap.  Can you give me a pointer?
> > >
> >
> > It's in the commons sandbox(?) simplestore.
> doh.  I don't know how I missed that.
> Yes, MRUMap does the same thing.  Linked list; O(n) retrieval, O(n)
> removal.  Kind of defeats the purpose of using a map at all, doesn't
> it?  :)

Sure does.  It wasn't noticed since the default size was set at 100.  

I'm looking for new memory management algorithms for JCS.  The LRU I'm
using, which is basically the LRUStore I added, is good, but I'd like
something more efficient for huge collections of objects.  Basically,
every get results in reshuffling the retrieved item to the front,
unless the item is already the front runner.  Since the likelihood of
the retrieved item already being first is small, I'd like to find
something even more efficient.
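For comparison, here is a minimal sketch of an LRU map that avoids an explicit O(n) list shuffle by building on java.util.LinkedHashMap in access-order mode: every get() is a hash lookup plus an O(1) relink, and eviction picks the true least-recently-used entry. The class name SimpleLRUMap is illustrative, not an existing Commons class.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: an LRU map using LinkedHashMap's access-order iteration.
// get() is an O(1) hash lookup plus an O(1) relink -- no list scan.
public class SimpleLRUMap<K, V> extends LinkedHashMap<K, V> {
    private final int maxSize;

    public SimpleLRUMap(int maxSize) {
        // accessOrder = true: iteration order runs from least- to
        // most-recently accessed entry
        super(16, 0.75f, true);
        this.maxSize = maxSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Evict the least-recently-used entry once capacity is exceeded
        return size() > maxSize;
    }

    public static void main(String[] args) {
        SimpleLRUMap<String, Integer> lru = new SimpleLRUMap<>(2);
        lru.put("a", 1);
        lru.put("b", 2);
        lru.get("a");     // touch "a", so "b" becomes least recently used
        lru.put("c", 3);  // evicts "b", not "a"
        System.out.println(lru.keySet()); // contains "a" and "c"
    }
}
```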

The cache keeps pushing items to the front.  When new items arrive, the
overflow is pushed off to disk in groups of, say, five.  The memory
cache plugs into the cache hub.  The hub has a bunch of auxiliary caches
plugged in and manages expiration and the transmission of items to the
auxiliaries based on the element and regional attributes of the cache.
If one of the auxiliaries is of the disk type, the overflow from the
memory cache (the LRU) is written to disk.  These puts are queued in a
very efficient manner, first going to a purgatory and then to disk (take
a look).
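A generic sketch of that purgatory pattern (the names here are illustrative, not JCS's actual classes): overflow puts land in an in-memory "purgatory" map plus a queue, a drain step flushes the queue to the slow store, and reads can still be served from purgatory until the write lands.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch of the purgatory pattern: cheap spooling for the caller,
// with the slow disk write deferred to a drain pass.
public class DiskPurgatory<K, V> {
    private final Map<K, V> purgatory = new ConcurrentHashMap<>();
    private final BlockingQueue<K> writeQueue = new LinkedBlockingQueue<>();
    // Stand-in for the real disk store
    private final Map<K, V> disk = new ConcurrentHashMap<>();

    // Called when the memory cache overflows: non-blocking for the caller.
    public void spool(K key, V value) {
        purgatory.put(key, value);
        writeQueue.add(key);
    }

    // Drains queued writes to the slow store; in a real cache this would
    // run on a background thread.
    public void drain() {
        K key;
        while ((key = writeQueue.poll()) != null) {
            V value = purgatory.remove(key);
            if (value != null) {
                disk.put(key, value); // the actual disk write
            }
        }
    }

    public V get(K key) {
        V v = purgatory.get(key); // still in purgatory: serve from memory
        return (v != null) ? v : disk.get(key);
    }
}
```

The point of the pattern is that the item stays readable throughout: before the drain it is served from purgatory, after the drain from disk.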

I'd like to reduce the amount of shuffling of memory elements somehow,
perhaps by keeping an integer tally of accesses and having a background
thread handle the overflow, so retrievals run at nearly plain map speed.
An algorithm based on the current access number and the total accesses
versus the total number of items could be used.  This might save a few
operations in the synchronous get.
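The tally idea above could be sketched like this (hypothetical names, not JCS code): get() only bumps a per-entry counter, so the hot path is nearly plain map speed, and a separate sweep -- which a background thread could run -- evicts the least-accessed entries later.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Sketch: access counting instead of move-to-front. The only work on
// the get() path is one counter increment; eviction is deferred.
public class CountingCache<K, V> {
    private static class Entry<V> {
        final V value;
        final AtomicLong hits = new AtomicLong();
        Entry(V value) { this.value = value; }
    }

    private final Map<K, Entry<V>> map = new ConcurrentHashMap<>();
    private final int maxSize;

    public CountingCache(int maxSize) { this.maxSize = maxSize; }

    public void put(K key, V value) { map.put(key, new Entry<>(value)); }

    public V get(K key) {
        Entry<V> e = map.get(key);
        if (e == null) return null;
        e.hits.incrementAndGet(); // the only bookkeeping on the hot path
        return e.value;
    }

    // Deferred eviction: drop the lowest-count entries until under
    // capacity; a background thread would call this periodically.
    public void sweep() {
        while (map.size() > maxSize) {
            K coldest = null;
            long fewest = Long.MAX_VALUE;
            for (Map.Entry<K, Entry<V>> e : map.entrySet()) {
                long h = e.getValue().hits.get();
                if (h < fewest) { fewest = h; coldest = e.getKey(); }
            }
            map.remove(coldest);
        }
    }

    public boolean containsKey(K key) { return map.containsKey(key); }
    public int size() { return map.size(); }
}
```

Note the trade-off: this approximates LFU rather than LRU, and the map can temporarily exceed its capacity between sweeps, which is exactly the asynchronous overflow handling described above.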

Since this isn't a bottleneck, I'm not too concerned with it, but it is
an interesting problem that someone might want to undertake. 

Good luck with your patches.


> regards,
> michael
> --

