cocoon-dev mailing list archives

From Berin Loritsch <blorit...@apache.org>
Subject Re: [RT] Adaptive Caching
Date Wed, 16 Jul 2003 12:38:11 GMT
Stefano Mazzocchi wrote:

> 
> On Tuesday, Jul 15, 2003, at 17:06 America/Guayaquil, Berin Loritsch wrote:
> 
>> Dumbing it down a lot will help immensely.  The problem is that it is 
>> already 18+ pages of hard to read stuff.  Then when we don't 
>> understand it we get verbally flogged.
> 
> 
> Berin, look: if you said "I don't get it, please rephrase" I would have 
> done it. If you said "I don't get this part" or "why are you doing 
> this?" I would have elaborated.

Here is the deal.  When I try to wrap my brain around something, I try to
rephrase things in terms and concepts that I understand.  I have studied
AI theory for a program that was being developed by my last company.  I
can "get" that because it works more cognitively for me.

So the response I get is a lot of flak.  Quite honestly, I simply state
things in terms that I have both observed and have experience with.  My
statements about the limited value (cost/benefit ratio) of partial pipeline
caching have to do with *my* experience.  Maybe others have had different
experiences, but all my dynamic information was always encapsulated in
one place: the generator.  The transformers had an infinite ergodic
period (infinite in that they did not change until there was a development
reason to do so).  The serializers were quite simple.

With that combination of facts, the only thing in my pipeline with an
ergodic period less than infinity was the generator.  For my static
pages, I could have simply compiled the results and served them.  For
my dynamic pages, the generation was pretty darn quick.

Then again, maybe my *experience* is not typical, which means what is
good enough for me is not good enough for others.

> 
>> Sorry I poopooed on your idea.  I'm sorry I don't have the mental 
>> capacity
>> to constructively contribute.
> 
> 
> Oh, please, you are way smarter than enough to get it. Maybe the 
> language is not what you are used to, granted, but I'm wide open to try 
> to outline the details. If you tell me what you don't understand I can 
> try explain it differently. If you don't, and just go on assuming, I 
> can't do anything but being frustrated at all the time I spent on this.


That was in response to your email.  By the time I was done reading it,
I felt like crap.  There are more similarities between what I presented
and you presented than you would like to admit.

The basic crux of programming Intelligent Agents is that agents react
to the state of the environment, discover the best response, and act
(usually in a way that affects the environment).

What you outlined was the "search algorithm" (the discovery method),
and the manner in which the agent (cache controller) acted upon the
environment.
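
The sense/discover/act loop described above can be sketched roughly as
follows.  This is only an illustration under stated assumptions: the
Environment fields, the expected-cost comparison, and every name here are
hypothetical and come neither from Cocoon nor from the whitepaper.

```python
from dataclasses import dataclass


@dataclass
class Environment:
    """Hypothetical observable state for one cacheable resource."""
    production_cost_ms: float  # time to regenerate the resource
    lookup_cost_ms: float      # time to look the resource up in the cache
    hit_ratio: float           # observed fraction of lookups that are valid
    cached: bool = False


def discover(env: Environment) -> str:
    """Discovery step: pick the action with the lower expected cost."""
    cost_without_cache = env.production_cost_ms
    # A hit pays only the lookup; a miss pays the lookup plus regeneration.
    cost_with_cache = (
        env.hit_ratio * env.lookup_cost_ms
        + (1.0 - env.hit_ratio) * (env.lookup_cost_ms + env.production_cost_ms)
    )
    return "cache" if cost_with_cache < cost_without_cache else "bypass"


def act(env: Environment, action: str) -> None:
    """Action step: the agent changes the environment it observes."""
    env.cached = action == "cache"


def step(env: Environment) -> str:
    """One pass of the agent: observe the state, decide, act."""
    action = discover(env)
    act(env, action)
    return action
```

With an expensive resource and a high hit ratio, step() settles on "cache";
with a cheap resource that is rarely valid in the cache, it settles on
"bypass".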

What you left as an exercise to the reader was the Cost function.  The
rules-based approach is one way of achieving that cost function declaratively.
Any time you incorporate conditional logic in your cost function, you
have explicit rules.  Any time you have a weighted cost that factors in
different elements you have implicit rules.  Explicit rules are easier to
debug and understand, also to predict.  There are different ways of applying
explicit rules without resorting to if/then/else or switch/case functions.
In fact you might be able to come up with a way to translate explicit rules
into a function with implicit rules.
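
To make the distinction concrete, here is a minimal sketch of both styles.
The Sample fields, the thresholds, and the weights are all made up for
illustration; the explicit rules live in a data table instead of an
if/then/else chain, and the implicit version folds the same factors into
one weighted sum.

```python
from typing import Callable, NamedTuple, Tuple


class Sample(NamedTuple):
    """Hypothetical measurements for one cached resource."""
    production_ms: float
    size_kb: float
    age_s: float


class Rule(NamedTuple):
    name: str
    applies: Callable[[Sample], bool]
    cost: Callable[[Sample], float]


# Explicit rules as data: the first rule whose predicate matches wins,
# and the rule's name says why a given cost was chosen.
RULES = [
    Rule("too-large", lambda s: s.size_kb > 1024, lambda s: float("inf")),
    Rule("cheap-to-produce", lambda s: s.production_ms < 5, lambda s: s.production_ms),
    Rule("default", lambda s: True, lambda s: s.production_ms + 0.01 * s.size_kb),
]


def explicit_cost(sample: Sample) -> Tuple[str, float]:
    """Return the matching rule's name together with its cost."""
    for rule in RULES:
        if rule.applies(sample):
            return rule.name, rule.cost(sample)
    raise ValueError("no rule matched")


# Implicit rules: no conditionals, only weights (assumed values).
WEIGHTS = {"production_ms": 1.0, "size_kb": 0.01, "age_s": 0.1}


def implicit_cost(sample: Sample) -> float:
    """Weighted sum of every factor; the weights are the hidden rules."""
    return sum(weight * getattr(sample, field) for field, weight in WEIGHTS.items())
```

The explicit version can report which rule fired, which is what makes it
easier to debug and predict.  The weighted version only yields a number, and
understanding it means reasoning about the weights.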

Truth be told, your search algorithm might be as efficient as they come.
It might not.

But do keep in mind the poor administrator/installer.  The guy who is
managing the install is more interested in the efficiency of the cache
and how Cocoon uses resources than the developer is.  The user in many cases
won't know the difference because their network connection is throttling
the response, or their browser is the bottleneck.  The efficiency of the
cache will likely have the most impact on the scalability of the server.
That is my OPINION, and should not be taken as gospel.  Please do not
shoot me for holding that opinion.

I took the 18 pages home and started to play with some code.  I appreciate
the journey to get to the search algorithm you proposed.  The functions
as declared in the beginning were not that efficient, and on my machine
as soon as you had more than ~410 samples for the invalid/valid cost
evaluations the function took ~10ms (the granularity of my clock) to
evaluate.  The evaluation used constants for all the cost figures because
that would be the computationally cheapest way to evaluate the efficiency
of those functions.
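
A measurement of this kind can be sketched as below.  The cost function is
a stand-in (a plain sum over constant valid/invalid costs), not the
whitepaper's actual function, and all names and constants are hypothetical.
Averaging over many repeats is one way around a clock too coarse to
resolve a single call.

```python
import time

# Constant cost figures, as in the experiment described above (values assumed).
VALID_COST = 1.0
INVALID_COST = 3.0


def total_cost(n_samples: int) -> float:
    """Stand-in evaluation: one cost term per sample, alternating valid/invalid."""
    total = 0.0
    for i in range(n_samples):
        total += INVALID_COST if i % 2 else VALID_COST
    return total


def time_per_call(fn, arg, repeats: int = 1000) -> float:
    """Average wall-clock seconds per call, measured across many repeats."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn(arg)
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    for n in (100, 410, 1000):
        print(n, "samples:", time_per_call(total_cost, n), "s per call")
```

The per-call time grows linearly with the number of samples, which is why
an approach that keeps only a fixed number of samples is so much cheaper.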

Only after this exercise did I reach, to my dismay, the better way of
evaluating a continual cost value.  With it, instead of keeping a sample
for each request at a particular time, we only have to work with three
samples.  That is a vastly improved search algorithm in terms of being
computationally cheap.

Before that, it was unclear which type would best represent the cost function,
but when you introduced the trigonometric functions into the mix, it was
clear that floating point precision was required.

Perhaps seeing what you want in code would be the best thing.  It would
help solidify things for those of us who don't do the math or are
overwhelmed by a very long whitepaper and by trying to derive from it how it
will practically make our lives better.  It would also help determine the
cost/benefit ratio
of actually developing this thing.  As it stands, any idea that takes 18
pages to explain gives the impression of a very high cost.  Whether this
bears out in practice or not remains to be seen.

-- 

"They that give up essential liberty to obtain a little temporary safety
  deserve neither liberty nor safety."
                 - Benjamin Franklin

