cocoon-dev mailing list archives

From Berin Loritsch <blorit...@apache.org>
Subject [RT] Managing Flow and Resources
Date Fri, 07 Dec 2001 21:05:27 GMT
There has been a lot of talk about Finite State Machine approaches, Staged Event
Driven Architecture (a refinement of the same), Scheme, and more.  When I saw the
post regarding Scheme it jogged my memory (a few days later).  When I was first
working for my company, we were developing a really cool tool that would automagically
create schematics (product category to shelf mapping) that follow the rules for
an organization so that a master schematic can be applied to a group of different
stores based on demographic information.  Since that is a mouthful, I will elaborate.
Retailers want to be able to map a bunch of products (organized by category) to
the shelves of their store.  They are well aware of the fact that different
products sell at different rates based on the demographic information of a store.
In other words, a store in an area where 90% of the people make $100,000 or more
a year will consume 3 times as much Häagen-Dazs ice cream as an area where only
10% of the people make that much.  This is known as corner-store marketing, i.e.
getting the specialization of a corner store from a major retailer.  The problem
with that is that you can have as many as 1000 different schematics for _each_
category if a division has 1000 different stores.  This is nearly impossible for
humans to manage.  Our Automatic Schematic Generation tool handles the specializations
based on a rules-based engine.

"So what does this have to do with Cocoon?" I hear you ask.  Good question!  One
of the tools we used to perform these very complex decisions was based on an open
source project called the Java Expert System Shell (JESS) which can be found at
http://herzberg.ca.sandia.gov/jess/.  The concept of JESS is very powerful.  The
reason why Scheme jogged my memory is because the JESS language is based on
Scheme--but only because the expert systems community is more familiar with it.

The over-simplified summary of the approach is that in a Rules-based system, you
have a set of Facts and a set of Rules.  Facts are your knowledge base (i.e. the
files you have in your context, the generators you have to choose from, etc.).
Rules are how to apply logic to that set of facts.  In Cocoon, the Environment
contains one set of facts that change with every request and the sitemap and
configuration files provide a set of facts that are constant.
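
To make the Facts/Rules split concrete, here is a minimal Java sketch.  It is
not JESS and not Cocoon code: the class FactsAndRules, the "uri" and
"stylesheet" fact names, and the demo rule are all assumptions made for
illustration.  Constant facts stand in for the sitemap and configuration,
request facts stand in for the Environment.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical sketch: facts are named values, a rule is a condition plus an action.
class FactsAndRules {

    static final class Rule {
        final Predicate<Map<String, String>> when;
        final Function<Map<String, String>, String> then;

        Rule(Predicate<Map<String, String>> when,
             Function<Map<String, String>, String> then) {
            this.when = when;
            this.then = then;
        }
    }

    // Merges constant and per-request facts, then applies every rule that matches.
    static List<String> fire(Map<String, String> constantFacts,
                             Map<String, String> requestFacts,
                             List<Rule> rules) {
        Map<String, String> facts = new HashMap<>(constantFacts);
        facts.putAll(requestFacts);
        List<String> results = new ArrayList<>();
        for (Rule rule : rules) {
            if (rule.when.test(facts)) {
                results.add(rule.then.apply(facts));
            }
        }
        return results;
    }

    // Example: one constant fact (a stylesheet) and one request fact (the URI).
    static List<String> demo(String uri) {
        Map<String, String> constant = new HashMap<>();
        constant.put("stylesheet", "/bar/foo2html.xsl");
        Map<String, String> request = new HashMap<>();
        request.put("uri", uri);

        List<Rule> rules = new ArrayList<>();
        rules.add(new Rule(
            facts -> facts.get("uri").endsWith(".html"),
            facts -> "transform with " + facts.get("stylesheet")));
        return fire(constant, request, rules);
    }
}
```

Only the request facts change from call to call; the rules and the constant
facts stay the same.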

The sitemap problem space can be viewed as a very simple rules-based system.
We make decisions based on URI, parameters, session variables, and more.  The
rules are simple if-then statements.  You can verify this by viewing the
generated source code for the Sitemap.  What we have run into is that the set of
rules available to us is too simple in some applications: "If we are requesting
this set of information, send this resource."  This is most obvious when we have
multipage forms and complex logic that has to be managed.
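
As a rough picture of what such an if-then rule amounts to, here is a Java
sketch.  It is not the code the sitemap actually generates: the class name
SimpleMatch, the "/foo/" prefix, the stylesheet path, and the string
description of the pipeline are assumptions chosen for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of a single URI-based if-then rule.
class SimpleMatch {

    // Matches "/foo/<name>.html" and captures <name>.
    private static final Pattern FOO_HTML = Pattern.compile("^/foo/(.+)\\.html$");

    // Returns a description of the selected pipeline, or null if no rule matches.
    static String selectPipeline(String uri) {
        Matcher matcher = FOO_HTML.matcher(uri);
        if (matcher.matches()) {
            String name = matcher.group(1);
            return "file:/foo/" + name + ".xml -> xslt:/bar/foo2html.xsl -> html";
        }
        return null;
    }
}
```

The condition only looks at the URI.  Anything that depends on what the user
did on a previous page has nowhere to go in a rule of this shape.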

So how do we express the sitemap in JESS terms?  It would be something like this:

(defrule select-pipeline
     (environment (uri (concat "/foo/" ?X ".html")))
   =>
     (execute-pipeline (generator (source (concat "/foo/" ?X ".xml")) (type file))
                       (transformer (source "/bar/foo2html.xsl") (type xslt))
                       (serializer (type html))
     )
)

Ok, this is ugly.  But what if we changed this to be more granular?

(defrule select-transformer
     (environment (uri (concat ?X ".html")))
   =>
     (add-transformer (transformer (source "/bar/foo2html.xsl") (type xslt)))
)

(defrule select-serializer
     (environment (uri (concat ?X ".html")))
   =>
     (set-serializer (serializer (type html)))
)

(defrule select-generator
     (environment (uri (concat ?X ".html")))
     (source (concat ?X "." ?Y))
   =>
     (set-generator (source (concat ?X ".xml")) (type ?Y))
)

(defrule execute-pipeline
     (generator (set true))
     (serializer (set true))
   =>
     (sitemap execute)
)

These rules say that all ".html" requests use the "html" serializer and the
"xslt" transformer with the "/bar/foo2html.xsl" stylesheet.  For the generator,
there is a one-to-one mapping from the file extension to the generator type
(e.g. the "serverpages" generator would be renamed "xsp").

These can all be handled neatly with the current sitemap.  It is important to note,
however, that every rule whose conditions match is applied, not just the first one.
Many times, we want to do something very complex that just can't be matched nicely.
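
The "every matching rule is applied" behaviour of the granular rules above can
be sketched in Java as follows.  This is an illustration only: GranularRules,
Pipeline, and the sourceExtension parameter (the extension of the source file
found for the request) are hypothetical names, and a real engine would
re-evaluate rules as facts change instead of relying on list order.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: independent rules each contribute one piece of a pipeline.
class GranularRules {

    static final class Pipeline {
        String generator;
        String serializer;
        final List<String> transformers = new ArrayList<>();
        boolean executed;
    }

    interface Rule {
        void apply(String uri, String sourceExtension, Pipeline pipeline);
    }

    static List<Rule> rules() {
        List<Rule> rules = new ArrayList<>();
        // select-transformer
        rules.add((uri, ext, p) -> {
            if (uri.endsWith(".html")) p.transformers.add("xslt:/bar/foo2html.xsl");
        });
        // select-serializer
        rules.add((uri, ext, p) -> {
            if (uri.endsWith(".html")) p.serializer = "html";
        });
        // select-generator: the generator type is the source file's extension
        rules.add((uri, ext, p) -> {
            if (uri.endsWith(".html") && ext != null) p.generator = ext;
        });
        // execute-pipeline: runs last here because this loop makes a single pass
        rules.add((uri, ext, p) -> {
            if (p.generator != null && p.serializer != null) p.executed = true;
        });
        return rules;
    }

    // Applies every rule; each rule decides for itself whether it matches.
    static Pipeline assemble(String uri, String sourceExtension) {
        Pipeline pipeline = new Pipeline();
        for (Rule rule : rules()) {
            rule.apply(uri, sourceExtension, pipeline);
        }
        return pipeline;
    }
}
```

For example, assemble("/foo/index.html", "xsp") sets the "xsp" generator, one
transformer, and the "html" serializer, and then marks the pipeline executed.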

For example, let's say we are in a multipage form whose exit depends on whether
the user is done adding items to their shopping cart, which in turn depends on
data entered by the user.  This absolutely cannot be simply expressed in the
sitemap.  Let us say that the two rules regarding transformers and serializers
apply across the board and our only interest is in the next page of the form.
Remember that the data in the form are facts that the rules engine can decide upon:

(defrule select-generator2
     (declare (salience 3))
     (session (add-more-items no))
  =>
     (set-generator (source "/foo/bar/baz.xml") (type xml))
)

What is this "salience" thing?  It is a precedence.  If multiple rules apply to
the same facts, then the one that has the highest salience fires first, so it
wins.  That is more of an expert shell type of thing, but it enables some
excellent approaches.  I.e. you have a base rule that fires, but it can be
overridden by another rule.  That way you always have a default value.
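
The default-plus-override idea can be sketched in Java like this.  It is a
simplification, not how JESS implements salience: among the rules whose
conditions hold, the sketch simply picks the one with the highest salience.
The class SalienceSketch and the default source "/foo/bar/next-form-page.xml"
are made up for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch: the matching rule with the highest salience wins.
class SalienceSketch {

    static final class Rule {
        final int salience;
        final Predicate<Map<String, String>> when;
        final String generatorSource;

        Rule(int salience, Predicate<Map<String, String>> when, String generatorSource) {
            this.salience = salience;
            this.when = when;
            this.generatorSource = generatorSource;
        }
    }

    // Returns the generator source chosen by the highest-salience matching rule.
    static String selectGenerator(Map<String, String> session, List<Rule> rules) {
        Rule winner = null;
        for (Rule rule : rules) {
            if (rule.when.test(session)
                    && (winner == null || rule.salience > winner.salience)) {
                winner = rule;
            }
        }
        return winner == null ? null : winner.generatorSource;
    }

    static String demo(String addMoreItems) {
        List<Rule> rules = new ArrayList<>();
        // Default rule: always matches, lowest precedence (hypothetical source).
        rules.add(new Rule(0, s -> true, "/foo/bar/next-form-page.xml"));
        // Override, mirroring select-generator2 above.
        rules.add(new Rule(3, s -> "no".equals(s.get("add-more-items")),
                "/foo/bar/baz.xml"));

        Map<String, String> session = new HashMap<>();
        session.put("add-more-items", addMoreItems);
        return selectGenerator(session, rules);
    }
}
```

With add-more-items set to "no" the override is chosen; with any other value
only the default matches.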

It is also important to note that these rules are expressed like "and" statements.
There are also some other interesting ways of expressing patterns:

(defrule example
   (not-b-and-c ?n1&~b ?n2&~c)
   (different ?d1 ?d2&~d1)
   (same ?s ?s)
   (more-than-one-hundred ?m&:(> ?m 100))
   (red-or-blue red|blue)
  =>
   (printout t "Found what I wanted!" crlf)
)

So if there is a fact whose "head" or type is "not-b-and-c" with two
attributes, where the first is not "b" and the second is not "c", and all the
other patterns also match, the rule fires and calls "printout".
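
Read as plain boolean tests, the five patterns in the example rule correspond
roughly to the following Java predicates.  The class and method names are
invented for illustration, and each fact is reduced to its attribute values.

```java
// Hypothetical sketch: each pattern constraint from the example as a predicate.
class PatternConstraints {

    // (not-b-and-c ?n1&~b ?n2&~c): first is not "b", second is not "c"
    static boolean notBAndC(String n1, String n2) {
        return !n1.equals("b") && !n2.equals("c");
    }

    // (different ?d1 ?d2&~d1): second value differs from the first
    static boolean different(String d1, String d2) {
        return !d2.equals(d1);
    }

    // (same ?s ?s): both values are equal
    static boolean same(String first, String second) {
        return first.equals(second);
    }

    // (more-than-one-hundred ?m&:(> ?m 100))
    static boolean moreThanOneHundred(int m) {
        return m > 100;
    }

    // (red-or-blue red|blue)
    static boolean redOrBlue(String color) {
        return color.equals("red") || color.equals("blue");
    }
}
```

The rule as a whole fires only when all five tests succeed, which is the
"and" behaviour mentioned above.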

Now, while I agree that Scheme is not an intuitive language, and very much
LISP-like (Lots of Imbedded Stinkin' Parentheses), it is easy to adapt to XML.
And more importantly, to existing Java systems.  In other words, you can
programmatically adjust the facts that the Rete decision engine uses with
standard rules, and watch it go to town obeying what you mean.  This is coupled
with a precedence scheme, so that when two sets of facts conflict, I can decide
which rule is more important.  This is more powerful than simply ordering the
rules in the Sitemap.

I don't quite understand all of this either, but the gist is this:   A rules
based system is a way to declare the rules based on a known set of facts.  The
flexibility of a system like that allows a system that is astoundingly complex
to be simplified.  The Facts are the available resources (i.e. database
resources, file resources, session resources, request resources, etc.).  The
rules are a declaration of what to do with the facts.  The Rete algorithm is a
way to optimize the decision making process: it remembers which conditions are
already satisfied, so the cost of a decision depends mostly on what changed and
is closer to constant time than linear time in the number of rules.
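
The following Java sketch is not the Rete algorithm, only an illustration of
the idea behind it: index rules by the facts they mention and remember partial
matches, so that asserting one fact touches only the rules that care about it.
IncrementalMatcher and its methods are hypothetical names, and facts are
reduced to their "head".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: remember partial matches instead of re-checking every rule.
class IncrementalMatcher {

    // rule name -> fact heads the rule needs
    private final Map<String, Set<String>> required = new HashMap<>();
    // fact head -> rules that mention it
    private final Map<String, List<String>> rulesByHead = new HashMap<>();
    // rule name -> required heads already asserted (the remembered partial match)
    private final Map<String, Set<String>> satisfied = new HashMap<>();

    void addRule(String name, String... heads) {
        required.put(name, new HashSet<>(Arrays.asList(heads)));
        satisfied.put(name, new HashSet<>());
        for (String head : heads) {
            rulesByHead.computeIfAbsent(head, k -> new ArrayList<>()).add(name);
        }
    }

    // Asserts a fact and returns the rules that became fully matched because of it.
    // Only rules indexed under this head are visited.
    List<String> assertFact(String head) {
        List<String> activated = new ArrayList<>();
        for (String rule : rulesByHead.getOrDefault(head, Collections.emptyList())) {
            Set<String> seen = satisfied.get(rule);
            if (seen.add(head) && seen.equals(required.get(rule))) {
                activated.add(rule);
            }
        }
        return activated;
    }
}
```

With a rule "execute-pipeline" that needs "generator" and "serializer",
asserting "generator" activates nothing, and asserting "serializer" afterwards
activates the rule without re-examining any unrelated rule.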

Such an approach would be interesting to follow....

Have I successfully confused everyone yet?  The JESS link above is a better source of help.

-- 

"They that give up essential liberty to obtain a little temporary safety
  deserve neither liberty nor safety."
                 - Benjamin Franklin

