cocoon-dev mailing list archives

From Sylvain Wallez <sylvain.wal...@anyware-tech.com>
Subject Re: AW: The Problems with Sitemap Factories
Date Fri, 12 Oct 2001 14:54:11 GMT

Carsten Ziegeler wrote:
> 
> > Sylvain Wallez wrote:
> >
> > Carsten Ziegeler wrote:
> > >
> > > Hi Sylvain,
> > >
> > > > Sylvain Wallez wrote:
> > > >
> > > > +1. I find these far too complicated and like the idea of an
> > > > interpreted sitemap, and flowmaps even more!
> > > >
> > > > But in order to still have a good performance we need a way for some
> > > > matchers to prepare patterns (e.g. precompile regexps) during sitemap
> > > > initialization. So what about a "PreparedMatcher" ?
> > > >
> > > > interface Matcher {
> > > >   Map match (Object pattern, Map objectModel, Parameters parameters);
> > > > }
> > > >
> > > > interface PreparedMatcher {
> > > >   Object prepare(String pattern);
> > > >   Map match (Object pattern, Map objectModel, Parameters parameters);
> > > > }
> > > >
> > > > As you see, this requires a small modification of the Matcher
> > > > interface, since the "pattern" parameter will be an Object:
> > > > - it's the pattern String (as now) for non-prepared matchers,
> > > > - it's the object returned by prepare() for prepared matchers
> > > >
> > > > Or we can simply add prepare() to Matcher and have AbstractMatcher
> > > > return the pattern as is.
> > > >
> > > Could you give an example of a real matcher and how that would implement
> > > the interface?
> >
> > Here is the RegexpURIMatcherFactory rewritten as a Matcher with the
> > additional prepare() method (which I finally prefer to a separate
> > PreparedMatcher interface):
> >
> > public class RegexpURIMatcher implements Matcher {
> >
> >   // Normalize the pattern, then compile it once as an RE object
> >   public Object prepare(String pattern) throws Exception {
> >     String pat = correctPattern(pattern);
> >     return new RE(pat);
> >   }
> >
> >   // "pattern" is the RE returned by prepare()
> >   public Map match (Object pattern, Map objectModel,
> >                     Parameters parameters) {
> >     RE re = (RE)pattern;
> >     String uri = XSPRequestHelper.getSitemapURI(objectModel);
> >     if(uri.startsWith("/")) uri = uri.substring(1);
> >
> >     if(re.match(uri)) {
> >       HashMap map = new HashMap();
> >       int parenCount = re.getParenCount();
> >       for (int paren = 1; paren <= parenCount; paren++) {
> >         map.put(Integer.toString(paren), re.getParen(paren));
> >       }
> >       return map;
> >     } else {
> >       return null;
> >     }
> >   }
> >
> >   public String correctPattern(String pattern) {
> >     ... (same as RegexpURIMatcherFactory)
> >   }
> > }
> >
> > You can see from this example that the only performance cost compared to
> > a factory is at sitemap startup time, where the regexp is compiled
> > instead of being created from a byte array. Request handling is
> > identical.
> >
> Ok, I understand this. So the usual order of calls is:
> - When the sitemap is generated, a matcher object (for each type) is
> instantiated.
> - If the matcher is preparable:
>       For each pattern occurring in the pipelines, the prepare() method is
>       called and the pattern is "replaced" with the result.
> - In the pipelines the match is done using the match() method with
>   either the pattern or the prepared pattern.
> Right?

Exactly. I also finally think we could have a single Matcher interface
including the prepare() method. Implementations that don't want to
prepare something just return the pattern unchanged. This will reduce
complexity on both the implementor's side (doesn't have to choose
between prepared/unprepared) and sitemap's side (same handling of all
matchers).
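
To make the single-interface idea concrete, here is a rough sketch. It is
only an illustration under assumptions: the names PreparableMatcher,
AbstractPreparableMatcher and ExactURIMatcher are hypothetical, the
Parameters argument is left out, and the "uri" key in the object model is
invented so that the code compiles without any Cocoon or Avalon classes.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical single interface: every matcher has prepare().
interface PreparableMatcher {
    Object prepare(String pattern) throws Exception;
    Map<String, String> match(Object preparedPattern, Map<String, Object> objectModel);
}

// Default for matchers that have nothing to precompute:
// the "prepared" pattern is simply the pattern String itself.
abstract class AbstractPreparableMatcher implements PreparableMatcher {
    public Object prepare(String pattern) {
        return pattern;
    }
}

// A trivial matcher that inherits the default prepare().
class ExactURIMatcher extends AbstractPreparableMatcher {
    public Map<String, String> match(Object preparedPattern, Map<String, Object> objectModel) {
        // The "uri" key is an assumption made for this sketch.
        String uri = (String) objectModel.get("uri");
        // Empty map on success, null on failure, as in the examples of this thread.
        return preparedPattern.equals(uri)
                ? Collections.<String, String>emptyMap()
                : null;
    }
}
```

With this shape the sitemap can call prepare() on every matcher without
asking whether it is "prepared" or not.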

> There is only one thing I don't like: the necessary casting of the
> pattern in the match() method. I think it should still be a String
> and the prepare() method should also return a String.
> So, the next thing we could add is the SourceResolver to the match() method.

In many cases it can't be a String: in the example above, the RE class
is an object representing the compiled regular expression (a FSM parser
?). If the pattern parameter is forced to be a String, then the regexp
cannot be precompiled and thus we lose the benefit of prepare().

I also don't think pattern (or better named "preparedPattern") being an
Object in match() is a problem, since its producer (the prepare()
method) is in the same class as its consumer (the match() method). The
contract defining the actual type of the prepared pattern doesn't need
to be known outside the Matcher class itself.
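
A small sketch of that producer/consumer contract, under assumptions: it
uses java.util.regex.Pattern as a stand-in for the RE class, takes the URI
directly instead of an object model, requires the whole URI to match, and
the class name RegexMatcherSketch is hypothetical. The caller only ever
sees an Object; the cast lives next to the code that produced the value.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

class RegexMatcherSketch {

    // Producer: compile once, e.g. at sitemap initialization.
    public Object prepare(String pattern) {
        return Pattern.compile(pattern);
    }

    // Consumer: called per request. Only this class knows that the
    // prepared pattern is a java.util.regex.Pattern.
    public Map<String, String> match(Object preparedPattern, String uri) {
        java.util.regex.Matcher m = ((Pattern) preparedPattern).matcher(uri);
        if (!m.matches()) {
            return null;
        }
        // Expose the capture groups under the keys "1", "2", ...
        Map<String, String> groups = new HashMap<>();
        for (int i = 1; i <= m.groupCount(); i++) {
            groups.put(Integer.toString(i), m.group(i));
        }
        return groups;
    }
}
```

The sitemap would keep the Object returned by prepare() and hand it back
untouched on every request.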

> > > > I don't know if Selectors could also benefit from a
> > > > "PreparedSelector".
> > > >
> > > > And since we're talking about matchers and selectors, a few thoughts
> > > > (for 2.1 ?) :
> > > > - What about giving matchers and selectors access to the source
> > > >   resolver?
> > > I remember that we had a discussion on this, but I can't remember
> > > the agreement we made. Do you recall it?
> > > Unless anything speaks against it, we should add the source resolver
> > > to the interfaces *if* we change the interfaces anyway.
> >
> > IIRC, there wasn't any clear conclusion, maybe because this requires
> > interface changes. But I'm +1 for adding them if interfaces have to
> > change because of the removal of CodeFactory.
> >
> > > > - I've always been confused that selectors don't have a method called on
> > > > "map:select" to set up a context that could be reused between the
> > > > different "map:when" alternatives. In many cases this would allow
> > > > select() to be reduced to a simple equality test and thus speed up
> > > > processing.
> > > >
> > > Could you please give an example here, too?
> > >
> >
> > Here it is, based on the BrowserSelector !
> > The new prepare() method is called on <map:select> and returns an object
> > which is passed back to select() (called for each <map:when>).
> >
> > prepare() gathers the information that all the <map:when> alternatives
> > need to do their job. In this example that isn't very CPU-costly, but
> > there is a real benefit if the context involves a database query.
> > Additionally, the "Vary" header is set once, while in the
> > factory version it is set for each <map:when>.
> >
> > public class BrowserSelector implements Selector {
> >   // Association of select names to array of user-agent strings,
> >   // built in configure()
> >   Map agentNames;
> >
> >   public Object prepare(Map objectModel, Parameters parameters) {
> >     // Indicate this response depends on the user-agent
> >     XSPResponseHelper.addHeader(objectModel, "Vary", "User-Agent");
> >     // Return the user-agent from the request
> >     if (objectModel.get(Constants.REQUEST_OBJECT) != null) {
> >       return XSPRequestHelper.getHeader(objectModel,"User-Agent");
> >     }
> >     return null;
> >   }
> >
> >   public boolean select(String expression, Object selectContext,
> >                         Map objectModel, Parameters parameters) {
> >     if (selectContext == null)
> >       return false;
> >
> >     String[] agents = (String[]) agentNames.get(expression);
> >     if (agents == null) // no definition of expression
> >       return false;
> >
> >     String userAgent = (String) selectContext;
> >     for (int i = 0; i < agents.length; i++) {
> >       if (userAgent.indexOf(agents[i]) != -1)
> >         return true;
> >     }
> >     return false;
> >   }
> > }
> >
> Ok, the same as for the matchers applies to these selectors: String as
> input, SourceResolver, and I would like to remove the XSP dependency (I
> know, this is also in the sitemap.xsl, but it's actually not necessary).
> 
And the same remark about Strings ;)
The XSP dependencies are there because I copied and pasted existing code:
these are just convenience wrappers for direct access to the request and
response, and can easily be removed.
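
For illustration, here is roughly how the selector could look without the
XSP helpers, together with the call order on <map:select>. This is a
sketch under assumptions: the request is reduced to a plain map of
headers, the Vary header and the Parameters argument are left out, the
agent table is hard-coded instead of built in configure(), and
BrowserSelectorSketch, firstMatchingWhen and prepareCalls are hypothetical
names.

```java
import java.util.HashMap;
import java.util.Map;

class BrowserSelectorSketch {
    // Select name -> user-agent substrings (sample values for this sketch).
    private final Map<String, String[]> agentNames = new HashMap<>();
    // Counts prepare() calls, only to show that it runs once per select.
    int prepareCalls = 0;

    BrowserSelectorSketch() {
        agentNames.put("explorer", new String[] {"MSIE"});
        agentNames.put("lynx", new String[] {"Lynx"});
    }

    // Called once on <map:select>: extract the user-agent (may be null).
    Object prepare(Map<String, String> requestHeaders) {
        prepareCalls++;
        return requestHeaders.get("User-Agent");
    }

    // Called for each <map:when>: a cheap test against the prepared context.
    boolean select(String expression, Object selectContext) {
        if (selectContext == null) return false;
        String[] agents = agentNames.get(expression);
        if (agents == null) return false; // no definition of expression
        String userAgent = (String) selectContext;
        for (String agent : agents) {
            if (userAgent.contains(agent)) return true;
        }
        return false;
    }

    // What the sitemap would do: prepare once, then try each alternative in order.
    String firstMatchingWhen(String[] whenTests, Map<String, String> requestHeaders) {
        Object context = prepare(requestHeaders);
        for (String test : whenTests) {
            if (select(test, context)) return test;
        }
        return null; // would fall through to <map:otherwise>
    }
}
```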

> But mainly, I agree with your suggestions. So let's wait for other
> opinions.
> 
Great !
Sylvain.

> Carsten
> 
> > > Carsten
> > >
> > > > Sylvain.
> > > >
> > > > Carsten Ziegeler wrote:
> > > > >
> > > > > Hi Team,
> > > > >
> > > > > I would like to propose that we get rid of the sitemap factories,
> > > > > the selector and matcher factory.
> > > > >
> > > > > I see at least three reasons for this:
> > > > > - If you want to use a matcher factory inside a subsitemap,
> > > > >   you currently MUST redefine it in the subsitemap as it is
> > > > >   not "inherited" from the parent sitemap. This is of course
> > > > >   also true for selectors (I already entered this
> > > > >   in bugzilla).
> > > > >   Using matchers and selectors in subsitemaps becomes very
> > > > >   error prone, as the sitemap editor always has to be
> > > > >   aware of whether each one is *implemented* as a factory or
> > > > >   not. I think the sitemap editor should not have to know about
> > > > >   such technical details.
> > > > > - The factories are hard to code. Java code generated from strings
> > > > >   is not so easy to write.
> > > > > - This is needed for the new RT, like the recent Tree traversal
> > > > >   approach
> > > > >
> > > > > So I'm +3 on removing the factories, and this even for the final
> > > > > release!
> > > > >
> > > > > Carsten
> > > > >
> > > > > Open Source Group                        sunShine - b:Integrated
> > > > > ================================================================
> > > > > Carsten Ziegeler, S&N AG, Klingenderstrasse 5, D-33100 Paderborn
> > > > > www.sundn.de                          mailto: cziegeler@sundn.de
> > > > > ================================================================
> > > >
> > > > --
> > > > Sylvain Wallez
> > > > Anyware Technologies - http://www.anyware-tech.com
> > > >
> >
> > --
> > Sylvain Wallez
> > Anyware Technologies - http://www.anyware-tech.com
> >
-- 
Sylvain Wallez
Anyware Technologies - http://www.anyware-tech.com

---------------------------------------------------------------------
To unsubscribe, e-mail: cocoon-dev-unsubscribe@xml.apache.org
For additional commands, email: cocoon-dev-help@xml.apache.org

