cocoon-dev mailing list archives

From "Hunsberger, Peter" <>
Subject RE: External/Event Based Cache Invalidation (somewhat long)
Date Thu, 26 Jun 2003 18:24:50 GMT
Geoff Howard <> writes:
> Below is the larger picture I envision for a new kind of 
> cache invalidation that I've needed in the past and comes 
> up in requests from people using EJB or database driven data 
> that is cacheable.  I'd love feedback from anyone who's 
> interested. 

As you know, we're interested.  However, we're in the middle of rolling
out our first release at the moment.  Until the bug reports slow down we
won't really have any time to look at this....

> That leaves only several choices
> - make the event know about what cache keys it needs to 
> remove.  This is the only solution currently available and it 
> has in practice meant hard coding the event-key relationship 
> somewhere and manually maintaining it. Not good IMHO.

This is essentially what we currently do.  It's a little generalized in
that we have specific data classes associated with specific Cocoon
generators and we've added an interface to those classes that specifies
a method that can be used to generate the cache key for any given data
item.  We then maintain our own hash map that maps the events to the
keys.  We don't remove the items from the Cocoon cache, rather we map to
the cache validity objects and call an "invalidate" method on them that
flips a flag to return invalid for the given object once an event
invalidation occurs.  The extra overhead of maintaining the extra hash
map isn't great, but not horrible either. We're currently missing a way
to have dependencies invalidated (but that could be aggregated
validities or another map from the pointers to the cache validity

It's worth noting that we sort of borrow from Sylvain's method of
retroactively updating cache validity objects: our objects start out
invalid in the generator setup and aren't marked valid until the actual
generate method has completed (you have to bootstrap things into the
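
A minimal Java sketch of the flag-based approach described above, under
stated assumptions: FlagValidity and EventRegistry are hypothetical names,
not Cocoon classes, and events are plain strings. The validity starts out
invalid, is marked valid once generation has completed, and is flipped
back when an event fires. One extra assumption that is not in the message:
an event arriving before generation completes wins, so a late markValid()
does not resurrect the entry.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a cache validity object; not Cocoon's API.
final class FlagValidity {
    private enum State { PENDING, VALID, INVALID }

    // Starts out PENDING, which is reported as invalid.
    private State state = State.PENDING;

    // Called after generate() has completed; ignored if already invalidated.
    synchronized void markValid() {
        if (state == State.PENDING) {
            state = State.VALID;
        }
    }

    // Called when an invalidation event arrives.
    synchronized void invalidate() {
        state = State.INVALID;
    }

    synchronized boolean isValid() {
        return state == State.VALID;
    }
}

// Hypothetical registry: the "extra hash map" from events to validities.
final class EventRegistry {
    private final Map<String, List<FlagValidity>> byEvent = new HashMap<>();

    synchronized void register(String event, FlagValidity validity) {
        byEvent.computeIfAbsent(event, k -> new ArrayList<>()).add(validity);
    }

    // Flips the flag on every validity registered for the event and
    // forgets the mapping; returns how many validities were invalidated.
    synchronized int fire(String event) {
        List<FlagValidity> hit = byEvent.remove(event);
        if (hit == null) {
            return 0;
        }
        for (FlagValidity validity : hit) {
            validity.invalidate();
        }
        return hit.size();
    }
}
```

The cached item itself is never removed here; the cache simply finds it
invalid the next time it checks the validity.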

> - search through every cached item on the receipt of an event 
> to see if it is now invalid in light of the current event.  
> Also not good.

No thanks, we considered this and rejected it...

> - Extend the cache to provide support for the cache-event 
> concept.  This is the tack I'm taking.  Essentially, this 
> solution involves the CacheImpl keeping track of mapping the 
> Events declared in an EventValidity to the key under which 
> that response is stored in the cache.
> The "glue" that is missing is the 
> org.apache.cocoon.caching.impl.CacheImpl
> extension, because it won't compile without the change I made 
> to sourceresolve, which is not yet in cocoon's cvs.  For some 
> odd reason I'm having a hard time building the sourceresolve 
> package using its build script.  It's also not "done" as 
> noted below - but I'd love others to be able to work on it.
> Here are the issues the "event aware" CacheImpl would need to 
> take care of:
> - during store() it gets the SourceValidity[] from the 
> CachedResponse and looks for instanceof EventValidity 
> (recursively for AggregatedValidity).
> - if found, it calls EventValidity.getEvent() and stores the 
> key-event mapping.

Sounds good, essentially it's the same thing we're doing, but your way
Cocoon manages it for us...
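
The store() step quoted above might look roughly like the following Java
sketch. Everything here is a simplified stand-in: the Validity interface,
the String-valued getEvent(), and the EventCollector helper are assumptions
made for illustration and do not reproduce the real Cocoon or Excalibur
types.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a source validity.
interface Validity {}

// Stand-in for the proposed EventValidity; the event is a String here.
final class EventValidity implements Validity {
    private final String event;

    EventValidity(String event) {
        this.event = event;
    }

    String getEvent() {
        return event;
    }
}

// Stand-in for a validity that wraps several child validities.
final class AggregatedValidity implements Validity {
    private final List<Validity> children = new ArrayList<>();

    AggregatedValidity add(Validity child) {
        children.add(child);
        return this;
    }

    List<Validity> getValidities() {
        return children;
    }
}

// Any validity that carries no event, e.g. a timestamp check.
final class OtherValidity implements Validity {}

final class EventCollector {
    // Returns every event declared in the array, descending into aggregates.
    static List<String> collect(Validity[] validities) {
        List<String> events = new ArrayList<>();
        for (Validity validity : validities) {
            collectInto(validity, events);
        }
        return events;
    }

    private static void collectInto(Validity validity, List<String> out) {
        if (validity instanceof EventValidity) {
            out.add(((EventValidity) validity).getEvent());
        } else if (validity instanceof AggregatedValidity) {
            for (Validity child : ((AggregatedValidity) validity).getValidities()) {
                collectInto(child, out);
            }
        }
        // Other validity types carry no event and are skipped.
    }
}
```

An event-aware store() would then record one key-event mapping for each
entry in the returned list.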

> - expose a removeByEvent(Event e) method that can be called 
> by the specific event-handling component.  This could be a 
> jms listener (as I've originally envisioned it) or an 
> http/soap based system (as in the ESI patch that was in
> bugzilla) or even a cocoon Action or Flow, or some 
> combination of all of the above.
> - When the key is ejected from the cache for other reasons 
> (another pipeline component signalled invalid for example) I 
> think it's necessary to at that moment remove the event-key 
> mapping entry.  This introduces a complication in the data 
> structure used to store these mappings as I mention below.  I 
> also haven't looked into the effect of the store janitor - if 
> it acts directly on the Store without going through the 
> CacheImpl wrapper, that introduces a wrinkle.
> Most of the above is accounted for - except for the data 
> structure to store the event-key mappings.  

I wondered when you were going to get to this...  

> As discussed 
> above, it needs to:
> - allow duplicate keys (each event may uncache multiple 
> pipelines, and each pipeline might be uncached by any of 
> multiple events).  So it needs a Bag.
> - allow lookup of mappings based on either event or key.  
> Jakarta Commons Collections has a DoubleOrderedMap, but not a 
> DoubleOrderedBag.  Bummer.
> - be persistent across shutdown and startup, and needs to 
> recover gracefully when the cache is missing (like when it's 
> been manually deleted)

Hmm, not so sure about this?  What are you envisioning here?

> - be efficient

Wouldn't two separate Maps (of Maps) also work?  More work to keep them
synced, but I think that's what you're going to end up building any way?
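
To make the two-maps idea concrete, here is a rough Java sketch of a
many-to-many index kept as two maps of sets that are updated together.
EventKeyIndex and its method names are hypothetical, and persistence
across restarts is deliberately left out. It assumes that firing an event
removes the affected keys from the cache entirely, so their links to other
events are dropped as well.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical bidirectional index between events (E) and cache keys (K).
final class EventKeyIndex<E, K> {
    private final Map<E, Set<K>> keysByEvent = new HashMap<>();
    private final Map<K, Set<E>> eventsByKey = new HashMap<>();

    // Records that the given event should uncache the given key.
    synchronized void put(E event, K key) {
        keysByEvent.computeIfAbsent(event, e -> new HashSet<>()).add(key);
        eventsByKey.computeIfAbsent(key, k -> new HashSet<>()).add(event);
    }

    // Returns the keys to eject for this event and forgets all their mappings.
    synchronized Set<K> removeByEvent(E event) {
        Set<K> keys = keysByEvent.remove(event);
        if (keys == null) {
            return Collections.emptySet();
        }
        for (K key : keys) {
            removeKey(key); // also unlinks the key from every other event
        }
        return keys;
    }

    // Called when a key leaves the cache for any other reason.
    synchronized void removeKey(K key) {
        Set<E> events = eventsByKey.remove(key);
        if (events == null) {
            return;
        }
        for (E event : events) {
            unlink(keysByEvent, event, key);
        }
    }

    // Snapshot of the keys currently mapped to an event.
    synchronized Set<K> keysFor(E event) {
        Set<K> keys = keysByEvent.get(event);
        return keys == null ? new HashSet<>() : new HashSet<>(keys);
    }

    synchronized boolean isEmpty() {
        return keysByEvent.isEmpty() && eventsByKey.isEmpty();
    }

    // Removes b from a's set and drops the set once it is empty.
    private static <A, B> void unlink(Map<A, Set<B>> map, A a, B b) {
        Set<B> set = map.get(a);
        if (set != null && set.remove(b) && set.isEmpty()) {
            map.remove(a);
        }
    }
}
```

Keeping both directions behind one synchronized class is what keeps the
two maps in step; callers never touch either map directly.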

> I have made an assumption so far that I'd like tested by some 
> sharp minds. When a removeByEvent() is received, the 
> CacheImpl would do something like PipelineCacheKey[] 
> getByEvent() on its datastructure.  This would rely on
> hashCode() and equals() of Event() to locate relevant events. 
>  I think this works well for true "equals" type of 
> information: like "table_name" and "primary_key" -- if they 
> are both equal, the event has happened.  But there may be 
> some where a "greater than" or "less than" or worse yet, a 
> kind of wild card lookup might need to be supported. 

Well, as long as you can map events to keys many-to-many, haven't you
already got this (though not in an automatic regex kind of way)?

> Can 
> that be accommodated by a Collections sort of implementation, 
> or does something more flexible need to be invented? As it 
> stands, you might implement hashCode() in a way that will 
> cause intentional collisions and rely on equals() to sort 
> things out.  Is that crazy?

Not sure, would almost need to see some pseudo code...
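
As a starting point for that pseudo code, here is one hedged Java sketch
of wildcard lookup. It does not use the hashCode()/equals() trick itself:
an equals() that treats a wildcard as matching anything is not transitive,
and HashMap.get() returns at most one value per lookup anyway. Instead it
buckets registrations by a coarse key (the table name) and filters the
bucket with a separate matches() method. RowEvent, WildcardIndex, and the
use of a null primary key as "any row" are all assumptions made for
illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical event: a table name plus a primary key; null means "any row".
final class RowEvent {
    final String table;
    final String primaryKey;

    RowEvent(String table, String primaryKey) {
        this.table = table;
        this.primaryKey = primaryKey;
    }

    // True when the tables are equal and the primary keys are equal
    // or either side is the wildcard.
    boolean matches(RowEvent other) {
        return table.equals(other.table)
            && (primaryKey == null
                || other.primaryKey == null
                || primaryKey.equals(other.primaryKey));
    }
}

final class WildcardIndex {
    private static final class Registration {
        final RowEvent event;
        final String cacheKey;

        Registration(RowEvent event, String cacheKey) {
            this.event = event;
            this.cacheKey = cacheKey;
        }
    }

    // Coarse bucket: every registration for a table lands in the same list.
    private final Map<String, List<Registration>> byTable = new HashMap<>();

    void register(RowEvent event, String cacheKey) {
        byTable.computeIfAbsent(event.table, t -> new ArrayList<>())
               .add(new Registration(event, cacheKey));
    }

    // Scans only the fired event's bucket, in registration order.
    List<String> keysMatching(RowEvent fired) {
        List<String> out = new ArrayList<>();
        List<Registration> bucket = byTable.get(fired.table);
        if (bucket == null) {
            return out;
        }
        for (Registration registration : bucket) {
            if (registration.event.matches(fired)) {
                out.add(registration.cacheKey);
            }
        }
        return out;
    }
}
```

The cost is a linear scan within one bucket per event, which is the same
work an intentionally colliding hashCode() would cause, but without
bending the equals() contract.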
