cocoon-dev mailing list archives

From "Unico Hommes" <Un...@hippo.nl>
Subject RE: External/Event Based Cache Invalidation (somewhat long)
Date Sun, 29 Jun 2003 10:35:15 GMT

I can't believe I've missed this post. Damn.

> 
> Below is the larger picture I envision for a new kind of cache
> invalidation that I've needed in the past and that comes up in requests
> from people using EJB or database driven data that is cacheable.  I'd
> love feedback from anyone who's interested.  I've BCC'd a few people
> who I've talked with about this in the past.
> 
> ----------------------------------------------------------------------
> 
> I've committed some little things into the cocoon scratchpad - part of
> some experimental work for external cache invalidation, which as I've
> thought about it is really "event" based cache invalidation.
> 
> Why? Because the events need not be external to Cocoon, and when you
> think about it, all cache validities are "external" in that the Cache
> validity object needs to consult some external reference - the
> filesystem, the system time, etc.
> 
> But event based invalidation has some differences in the way they need
> to be handled.  The other validities can reasonably be expected to
> check with their external resources (time, filesystem, etc) when
> retrieved from cache and isValid() is called.  But with events, the
> subsystem which received the events would need to store them all up
> waiting for the call to isValid() which depending on other factors
> might never come.  It seems to me more fitting with the transient
> nature of events to act on them when they arrive and then discard them.

That would definitely be the way to go.

> 
> That leaves only a few choices:
> - make the event know about what cache keys it needs to remove.  This
> is the only solution currently available and it has in practice meant
> hard coding the event-key relationship somewhere and manually
> maintaining it.  Not good IMHO.
> - search through every cached item on the receipt of an event to see
> if it is now invalid in light of the current event.  Also not good.

This is the approach I've taken in the past. It means the event must
still know about Cocoon and the way its sources are identified, though
to a lesser extent than with the former choice, where the event knows
about cache keys.

> - Extend the cache to provide support for the cache-event concept.
> This is the tack I'm taking.  Essentially, this solution involves the
> CacheImpl keeping track of mapping the Events declared in an
> EventValidity to the key under which that response is stored in the
> cache.
> 
> The "glue" that is missing is the
> org.apache.cocoon.caching.impl.CacheImpl extension, because it won't
> compile without the change I made to sourceresolve, which is not yet
> in cocoon's cvs.  For some odd reason I'm having a hard time building
> the sourceresolve package using its build script.  It's also not
> "done" as noted below - but I'd love others to be able to work on it.
> 
> Here are the issues the "event aware" CacheImpl would need to take
> care of:
> - during store() it gets the SourceValidity[] from the CachedResponse
> and looks for instanceof EventValidity (recursively for
> AggregatedValidity).
> - if found, it calls EventValidity.getEvent() and stores the key-event
> mapping.
> - expose a removeByEvent(Event e) method that can be called by the
> specific event-handling component.  This could be a jms listener (as
> I've originally envisioned it) or an http/soap based system (as in the
> ESI patch that was in bugzilla) or even a cocoon Action or Flow, or
> some combination of all of the above.
> - When the key is ejected from the cache for other reasons (another
> pipeline component signalled invalid for example) I think it's
> necessary to at that moment remove the event-key mapping entry.  This
> introduces a complication in the data structure used to store these
> mappings as I mention below.  I also haven't looked into the effect of
> the store janitor - if it acts directly on the Store without going
> through the CacheImpl wrapper, that introduces a wrinkle.

Hmm, that does seem to be the case.
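
To make the store()/removeByEvent() steps listed above concrete, here is
a minimal sketch. It is not the scratchpad code: Validity, EventValidity
and AggregatedValidity below are simplified stand-ins declared locally
(the real Cocoon/sourceresolve signatures may differ), events and cache
keys are plain strings, and EventAwareCache is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Simplified stand-ins, NOT the real Cocoon/sourceresolve types.
interface Validity {}

final class EventValidity implements Validity {
    private final String event;
    EventValidity(String event) { this.event = event; }
    String getEvent() { return event; }
}

final class AggregatedValidity implements Validity {
    private final List<Validity> children;
    AggregatedValidity(List<Validity> children) { this.children = children; }
    List<Validity> getValidities() { return children; }
}

// Hypothetical cache wrapper that records event -> keys on store().
final class EventAwareCache {
    private final Map<String, Object> store = new HashMap<>();
    private final Map<String, Set<String>> keysByEvent = new HashMap<>();

    void store(String key, Object response, Validity[] validities) {
        store.put(key, response);
        List<String> events = new ArrayList<>();
        for (Validity v : validities) {
            collectEvents(v, events);
        }
        for (String event : events) {
            keysByEvent.computeIfAbsent(event, x -> new HashSet<>()).add(key);
        }
    }

    // Walks aggregated validities recursively, collecting declared events.
    private static void collectEvents(Validity v, List<String> out) {
        if (v instanceof EventValidity) {
            out.add(((EventValidity) v).getEvent());
        } else if (v instanceof AggregatedValidity) {
            for (Validity child : ((AggregatedValidity) v).getValidities()) {
                collectEvents(child, out);
            }
        }
    }

    // Removes every response registered under the event; returns how many
    // entries were actually still in the store. Mappings held by OTHER
    // events for the same keys are left stale here, which is the
    // key-to-event cleanup problem raised in the list above.
    int removeByEvent(String event) {
        Set<String> keys = keysByEvent.remove(event);
        if (keys == null) {
            return 0;
        }
        int removed = 0;
        for (String key : keys) {
            if (store.remove(key) != null) {
                removed++;
            }
        }
        return removed;
    }

    boolean contains(String key) { return store.containsKey(key); }
}
```

An event handler (JMS listener, action, whatever) would then only need
to call removeByEvent() with the event it received and discard it.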

> 
> Most of the above is accounted for - except for the data structure
> to store the event-key mappings.  As discussed above, it needs to:
> - allow duplicate keys (each event may uncache multiple pipelines, and
> each pipeline might be uncached by any of multiple events).  So it
> needs a Bag.
> - allow lookup of mappings based on either event or key.  Jakarta
> Commons Collections has a DoubleOrderedMap, but not a
> DoubleOrderedBag.  Bummer.
> - be persistent across shutdown and startup, and needs to recover
> gracefully when the cache is missing (like when it's been manually
> deleted)
> - be efficient
> - be efficient
> 
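
One way the first two requirements could be met without a ready-made
"DoubleOrderedBag" is a pair of hash maps kept in sync. The sketch below
is only an illustration under that assumption: EventKeyRegistry and its
methods are hypothetical names, it uses sets (so a repeated event-key
pair collapses into one entry), and it does not address persistence.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical two-way multimap: event -> keys and key -> events.
final class EventKeyRegistry<E, K> {
    private final Map<E, Set<K>> keysByEvent = new HashMap<>();
    private final Map<K, Set<E>> eventsByKey = new HashMap<>();

    void register(E event, K key) {
        keysByEvent.computeIfAbsent(event, e -> new HashSet<>()).add(key);
        eventsByKey.computeIfAbsent(key, k -> new HashSet<>()).add(event);
    }

    // Read-only view of the keys currently mapped to an event.
    Set<K> keysFor(E event) {
        Set<K> keys = keysByEvent.get(event);
        return keys == null
                ? Collections.<K>emptySet()
                : Collections.unmodifiableSet(keys);
    }

    // Called when an event arrives: drops the event and returns the keys
    // that should now be removed from the cache.
    Set<K> removeEvent(E event) {
        Set<K> keys = keysByEvent.remove(event);
        if (keys == null) {
            return Collections.emptySet();
        }
        for (K key : keys) {
            unlink(eventsByKey, key, event);
        }
        return keys;
    }

    // Called when a key leaves the cache for any other reason, so that no
    // event keeps pointing at it.
    void removeKey(K key) {
        Set<E> events = eventsByKey.remove(key);
        if (events == null) {
            return;
        }
        for (E event : events) {
            unlink(keysByEvent, event, key);
        }
    }

    // Removes one member from an owner's set, dropping the set once empty.
    private static <A, B> void unlink(Map<A, Set<B>> index, A owner, B member) {
        Set<B> members = index.get(owner);
        if (members == null) {
            return;
        }
        members.remove(member);
        if (members.isEmpty()) {
            index.remove(owner);
        }
    }
}
```

Both lookups are hash-based, so this only covers exact-match events;
it relies on hashCode() and equals() of the event type in the way the
next paragraph describes.
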
> I have made an assumption so far that I'd like tested by some sharp
> minds.  When a removeByEvent() is received, the CacheImpl would do
> something like PipelineCacheKey[] getByEvent() on its datastructure.
> This would rely on hashCode() and equals() of Event() to locate
> relevant events.  I think this works well for true "equals" type of
> information: like "table_name" and "primary_key" -- if they are both
> equal, the event has happened.  But there may be some where a "greater
> than" or "less than" or worse yet, a kind of wild card lookup might
> need to be supported.  Can that be accommodated by a Collections sort
> of implementation, or does something more flexible need to be
> invented? As it stands, you might implement hashCode() in a way that
> will cause intentional collisions and rely on equals() to sort things
> out.  Is that crazy?
> 

I think this would be difficult and will impact performance because it
groups multiple keys under the same hash code. Consider wildcard
patterns where you'd like to invalidate **. In order for this to work,
all keys must return the exact same hash code. The same situation occurs
with hierarchy matching, i.e. if you want /path/to to match
/path/to/my/source (for instance in a directory generator that relies on
all sources under a certain context directory). In this case / matches
everything else too.

I actually experienced this last situation. What I did was to generate,
in addition to the original event, events for all ancestor paths as
well.
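
A minimal sketch of that ancestor expansion, assuming events are plain
slash-separated path strings. AncestorEvents and expand() are
hypothetical names for illustration, not Cocoon classes and not the code
from the project in question.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: expands one path event into the event itself plus
// one event per ancestor path, so exact-match lookup is enough.
final class AncestorEvents {
    static List<String> expand(String path) {
        String current = path;
        // Normalise trailing slashes, but leave the root "/" alone.
        while (current.length() > 1 && current.endsWith("/")) {
            current = current.substring(0, current.length() - 1);
        }
        List<String> events = new ArrayList<>();
        events.add(current);
        while (current.length() > 1) {
            int slash = current.lastIndexOf('/');
            if (slash < 0) {
                break; // relative path with no further parent
            }
            // A slash at index 0 means the parent is the root itself.
            current = (slash == 0) ? "/" : current.substring(0, slash);
            events.add(current);
        }
        return events;
    }
}
```

For /path/to/my/source this yields /path/to/my/source, /path/to/my,
/path/to, /path and /, so a cached response registered under /path/to is
found by plain equals()/hashCode() lookup, at the cost of one lookup per
path segment for every incoming event.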

> Geoff
> 
> > -----Original Message-----
> > From: ghoward@apache.org [mailto:ghoward@apache.org]
> > Sent: Friday, June 20, 2003 11:36 PM
> > To: cocoon-2.1-cvs@apache.org
> > Subject: cvs commit:
> > cocoon-2.1/src/scratchpad/src/org/apache/cocoon/caching/validity
> > NameValueEvent.java Event.java EventValidity.java NamedEvent.java
> >
> >
> > ghoward     2003/06/20 20:36:15
> >
> >   Added:       src/scratchpad/src/org/apache/cocoon/caching/validity
> >                         NameValueEvent.java Event.java
> >                         EventValidity.java NamedEvent.java
> >   Log:
> >   Experiment with external cache invalidation.  An
> >   EventAwareCacheImpl can't be committed yet because it relies on
> >   the latest sourceresolve cvs.

So let's update! I'm really eager to see this stuff and get my hands
dirty. What difficulties are you experiencing with building
sourceresolve? Can I help?

Regards,
Unico
