cocoon-dev mailing list archives

From Stefano Mazzocchi <stef...@apache.org>
Subject Re: [RT] sharing latest research production (long!)
Date Tue, 06 Mar 2001 09:03:47 GMT
Giacomo Pati wrote:
> 
> Stefano Mazzocchi wrote:
> 
> > Paul Lamb wrote:
> > > I've read your RT twice now and with your comments to Robin it does make
> > > sense. It brought back a real deja vu from over a decade ago when I sat
> > > through an entire day on the design and implementation of the scheduler,
> > > paging, MMU and TLB of the RS/6000 right after it first came out. Very
> > > similar thought processes.
> >
> > I thank you for this. It's a very good compliment.
> >
> > > My first thought on the formulas is what about when the first retrieval
> > > is hugely expensive, but after that it's not at all. I'm not sure where
> > > you'd put information like ignore the first x number of accesses.
> >
> > Oh, no, nothing hugely expensive. The only thing that happens is that
> > the frequency of caching will depend on the efficiency: the more
> > efficient the resource, the fewer hits it takes to reach nearly optimal
> > caching performance; the less efficient, the more hits it takes, but
> > since that resource was slow anyway, the net result is that no huge
> > expense is incurred.
> >
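> > (Not the actual formula, just a toy sketch of the relation I mean, with
> > made-up numbers and names: the number of hits needed before the estimate
> > stabilizes shrinks as the measured efficiency grows.)
> >
> >   double productionCost = 200.0;  // ms to produce the resource (made up)
> >   double lookupCost = 5.0;        // ms to serve it from the cache (made up)
> >
> >   // efficiency: fraction of the production cost a cache hit saves
> >   double efficiency = (productionCost - lookupCost) / productionCost;
> >
> >   // the more efficient, the fewer hits we need to observe before the
> >   // cache behaves nearly optimally for this resource
> >   long hitsNeeded = Math.round(10.0 / Math.max(efficiency, 0.01));
> >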
> > Anyway, I plan to write a small "visual fake cache" to show you how the
> > cache works, something like JMeter, that estimates efficiency and
> > visualizes cache entries and efficiency levels, and lets you modify
> > parameters in real time to show how it adapts.
> >
> > I know, it's a toy, but it could give interesting views into the best
> > way to visualize efficiency information.
> >
> > > From a real-world, non-theoretical, perspective the one that worries me
> > > is:
> > > >       +----------------------------------------------------------+
> > > >       |                                                          |
> > > >       | Result #2:                                               |
> > > >       |                                                          |
> > > >       | Each cacheable producer must generate the unique key of  |
> > > >       | the resource given all the environment information at    |
> > > >       | request time                                             |
> > > >       |                                                          |
> > > >       |   long generateKey(Environment e);                       |
> > > >       |                                                          |
> > > >       +----------------------------------------------------------+
> > >
> > > Here's the problem I see. I create a hash function that works great, the
> > > code goes into production for a year and then some really important
> > > person decides that there needs to be a change. The changes are
> > > relatively easily made to the producer but nobody thinks to update the
> > > hashing function. Now I've introduced the possibility that the wrong
> > > data will be pulled from the cache and delivered to someone. I'd really
> > > hate to try and track down a bug like this.
> > >
> > > Secondly, hash functions themselves. For a programmer who has never
> > > written one before they can seem rather foreign; I'd wager that most
> > > have never even seen one, and very few have had to code one. And from
> > > experience, one can require lots of testing to make sure it's 100%
> > > correct.
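> > >
> > > (Just to illustrate what I mean -- a hypothetical hand-rolled key with
> > > assumed accessor names; the fragility is exactly that this list of
> > > inputs has to be kept in sync with the producer by hand:)
> > >
> > >   public long generateKey(Environment e) {
> > >       // every piece of the environment the output depends on has to be
> > >       // folded in here -- forget one and stale data can leak out
> > >       long key = 17;
> > >       key = 37 * key + e.getURI().hashCode();
> > >       key = 37 * key + String.valueOf(e.getView()).hashCode();
> > >       key = 37 * key + String.valueOf(e.getAction()).hashCode();
> > >       return key;
> > >   }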
> > >
> > > Am I missing something here? Is this a lot easier than I think?
> >
> > No, you are totally right, from this perspective, it's a real pain in
> > the ass.
> >
> > But luckily, Ricardo and I thought about a solution for this problem
> > (yes, I explained part of this RT to Ricardo weeks ago): re-inversion of
> > control.
> >
> > Look at these interfaces:
> >
> >  public interface Cacheable {
> >    public void fillPolicy(CachePolicy policy);
> >  }
> >
> >  public interface CachePolicy {
> >     public void addDependency(String variableName, Object variableValue);
> >     public void setTimeToLive(long seconds);
> >  }
> >
> > The only thing you have to provide is the dependencies you have
> > (filename, cookie parameter value, time of the day, system load, etc)
> > and, if you have it, your time2live.
> >
> > The cache will then generate the key for you based on all the
> > information you provided.
> >
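> > (For illustration, an implementor's side of it could look like this; the
> > generator class and its fields are invented for the example:)
> >
> >   public class MyFileGenerator implements Cacheable {
> >       private String source;   // file this generator parses (assumed field)
> >       private String locale;   // request parameter it depends on (assumed)
> >
> >       public void fillPolicy(CachePolicy policy) {
> >           // just declare what the output depends on...
> >           policy.addDependency("source", source);
> >           policy.addDependency("locale", locale);
> >           // ...and, if known, how long it may live
> >           policy.setTimeToLive(3600);
> >       }
> >   }
> >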
> > Isn't it smart? :)
> 
> Well, we thought it is a little bit overdesigned, so here is the smart
> solution :-)) that we (Daniel and I) developed (I announced this prior to
> my vacation).
> 
> Imagine a CachableXMLProducer that implements the following interface:
> 
>   public interface CachableXMLProducer {
>     Object getKey();
>     CacheValidity getValidity();
>   }
> 
> The getKey method returns an object that uniquely identifies the universe
> the XMLProducer is in. For a FileGenerator it's probably the file it is
> parsing. A TraxTransformer returns the name of its stylesheet. Other
> producers return appropriate objects.
> 
> Take into account that the CacheManager will use this object together with
> the type of producer to get a unique key into its CacheStore.
> 
> The getValidity method returns a CacheValidity object that has a single
> isValid() method (more on this later).
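> 
> (For illustration, a FileGenerator-style producer could implement the two
> methods roughly like this; the fields and the TimeStampValidity class used
> here are assumptions, sketched further below:)
> 
>   import java.io.File;
> 
>   public class MyFileGenerator implements CachableXMLProducer {
>       private File source;   // the file being parsed
> 
>       public Object getKey() {
>           // uniquely identifies the universe this producer is in
>           return source.getAbsolutePath();
>       }
> 
>       public CacheValidity getValidity() {
>           // valid as long as the file has not changed on disk
>           return new TimeStampValidity(source.lastModified());
>       }
>   }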
> 
> The process of determining whether a CacheEntry is still valid goes like this.
> 
> The CacheManager asks the CachableXMLProducer to return its key (getKey()).
> This key, together with the type of XMLProducer it came from, is used as an
> index into a ValidityStore which holds the CacheValidity objects obtained so
> far. If there is no corresponding CacheValidity object, no CacheEntry has
> been stored so far. If there is a stored CacheValidity object, it is passed
> to the CacheValidity object obtained by calling the getValidity method of
> the CachableXMLProducer:
> 
>    Object key = cachableXMLProducer.getKey();
>    if ((validity = validityStore.get(key, cachableXMLProducer)) != null) {
>        newValidity = cachableXMLProducer.getValidity();
>        if (newValidity.isValid(validity)) {
>            // previous CacheEntry is still valid
>        } else {
>            // previous CacheEntry is invalid
>        }
>    }
> 
> Now let's talk about the CacheValidity object.
> 
>   public interface CacheValidity {
>     boolean isValid (CacheValidity validity);
>   }
> 
> It is important that the isValid() method of a CacheValidity object can only
> compare itself to exactly the same type of CacheValidity object. The
> algorithm used by the CacheManager above should ensure that only
> CacheValidity objects of the same type are compared. This also means that a
> specific type of CachableXMLProducer must always generate exactly the same
> type of CacheValidity object. The design with the CachePolicy of Ricardo and
> Stefano allows this, but we think it is not a good idea.
> 
> Almost 80% of them will be TimeStampValidity objects, as in the case of the
> FileGenerator or the TraxTransformer, so a concrete TimeStampValidity class
> will take care of this type of validity.
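> 
> (One possible shape for such a class -- just a sketch, assuming timestamp
> equality is what "still valid" means here:)
> 
>   public class TimeStampValidity implements CacheValidity {
>       private long timeStamp;
> 
>       public TimeStampValidity(long timeStamp) {
>           this.timeStamp = timeStamp;
>       }
> 
>       public boolean isValid(CacheValidity validity) {
>           // only comparable to a validity of its own type, as noted above
>           if (validity instanceof TimeStampValidity) {
>               return timeStamp == ((TimeStampValidity) validity).timeStamp;
>           }
>           return false;
>       }
>   }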
> 
> You can think of all kinds of validities you need.
> 
> Additionally, you can have a ValidityContainer class which holds several
> Validity objects and returns the ANDed value of all of them, as well as an
> OrValidityContainer which ORs those validation results.
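> 
> (Sketched in the same spirit -- the ANDing container could look roughly like
> this; the class and method names are assumptions:)
> 
>   import java.util.ArrayList;
>   import java.util.List;
> 
>   public class ValidityContainer implements CacheValidity {
>       private List validities = new ArrayList();
> 
>       public void add(CacheValidity validity) {
>           validities.add(validity);
>       }
> 
>       public boolean isValid(CacheValidity validity) {
>           if (!(validity instanceof ValidityContainer)) {
>               return false;
>           }
>           List others = ((ValidityContainer) validity).validities;
>           if (others.size() != validities.size()) {
>               return false;
>           }
>           // ANDed result: valid only if every contained validity is still
>           // valid against its stored counterpart
>           for (int i = 0; i < validities.size(); i++) {
>               CacheValidity mine = (CacheValidity) validities.get(i);
>               CacheValidity stored = (CacheValidity) others.get(i);
>               if (!mine.isValid(stored)) {
>                   return false;
>               }
>           }
>           return true;
>       }
>   }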
> 
> We think this design is very simple to implement and fast as well. It also
> allows any boolean combination of validity checks to be implemented.

Your solution doesn't address one problem, though: key generation is
left to the component implementor and might result in slow and
memory-consuming implementations.

This is why, in our solution, we re-invert the control: you tell us what
you depend on and we create the key for you.
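
(Concretely, the key derivation could then live in one place inside the cache,
along these lines -- a sketch of the idea, not actual code; the class name and
the hashing scheme are invented:)

  public class HashedCachePolicy implements CachePolicy {
      private long key = 17;
      private long timeToLive = -1;   // -1: no explicit expiration given

      public void addDependency(String variableName, Object variableValue) {
          // every declared dependency is folded into the key centrally,
          // so no component ever writes its own hash function
          key = 37 * key + variableName.hashCode();
          key = 37 * key + String.valueOf(variableValue).hashCode();
      }

      public void setTimeToLive(long seconds) {
          timeToLive = seconds;
      }

      public long getKey() {
          return key;
      }
  }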

But I resonate with the design paradigm of your CacheValidity object....
hmmm, maybe the two things can be merged together?

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<stefano@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------


