commons-dev mailing list archives

From: Jean-Francois Arcand
Subject: Re: digester 2.0 [WAS Re: [digester] [PROPOSAL] More pattern matching flexibility]
Date: Tue, 03 Sep 2002 15:40:47 GMT

Not sure if I can join the discussion ;-)... anyway, here are some ideas 
for Digester 2.0. We should also add some methods to:

- get/set the ContentHandler (if someone doesn't want to use the 
DigesterContext)
- get/set the ErrorHandler and the DTDHandler. Having run into hard 
problems with Xerces 2.0.1 - 2.0.2 and XML Schema, being able to set my 
own ErrorHandler would be useful.

It might be a good idea to have a method on the Matcher interface like

public void setErrorHandler(ErrorHandler eh)

so that different Matchers can customize their own ErrorHandler...

That would standardize the current wrapper around the XMLReader... right 
now, only the EntityResolver is exposed to clients.
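
To make the ErrorHandler suggestion concrete, here is a minimal sketch. ReaderWrapper and CollectingErrorHandler are hypothetical names used only for illustration, not existing Digester classes; the sketch only assumes the standard SAX and JAXP APIs.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;

// Hypothetical client-supplied handler: records problems instead of printing them.
class CollectingErrorHandler implements ErrorHandler {
    final List<String> messages = new ArrayList<>();

    @Override public void warning(SAXParseException e) {
        messages.add("warning: " + e.getMessage());
    }

    @Override public void error(SAXParseException e) {
        messages.add("error: " + e.getMessage());
    }

    @Override public void fatalError(SAXParseException e) throws SAXException {
        messages.add("fatal: " + e.getMessage());
        throw e; // a fatal error still aborts the parse
    }
}

// Hypothetical stand-in for the wrapper around the XMLReader.
class ReaderWrapper {
    private ErrorHandler errorHandler;

    public void setErrorHandler(ErrorHandler eh) { this.errorHandler = eh; }
    public ErrorHandler getErrorHandler() { return errorHandler; }

    // Returns true if the document was parsed without a fatal error.
    public boolean parse(String xml) {
        try {
            XMLReader reader =
                SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            if (errorHandler != null) {
                reader.setErrorHandler(errorHandler); // pass the client's handler through
            }
            reader.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }
}
```

A client would call setErrorHandler() once before parsing and inspect the collected messages afterwards; a DTDHandler setter could follow the same pattern via XMLReader.setDTDHandler().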

Just thinking out loud :-)

-- Jeanfrancois

Christopher Lenz wrote:

> robert burrell donkin wrote:
> [...]
>> Rules is an interface but since it's poorly named anyway, we might 
>> actually think about creating a new 'Matcher' class. we could then 
>> retain backwards compatibility by using an adapter. we could use 
>> christopher's nice strategy to retain compatibility for the rule class.
> Your Matcher idea in combination with an adapter for backwards 
> compatibility is pretty damn good ;o)
> And it got me experimenting...
> Let's assume the responsibility of the Matcher interface would be only 
> to match string patterns against the current XML document context. For 
> a start, it could have an interface like this:
>   public interface Matcher {
>       // add a pattern
>       public void add(String pattern);
>       // remove all patterns
>       public void clear();
>       // return all patterns that match the current XML document
>       // context, in the order they've been added
>       public List matches(DigesterContext context);
>       // return all patterns in the order they've been added
>       public List patterns();
>   }
> Note that the Matcher doesn't store the Rules themselves, that will be 
> the responsibility of Digester. Digester will just request the matched 
> patterns from the Matcher and then lookup the corresponding rules in a 
> map (for example).
> Now, there might be a simpler and yet more flexible design... what if 
> someone wanted to create a Matcher that understands full-blown XPath 
> expressions? In that case DigesterContext would be the limiting 
> factor, as the Matcher would not have access to - for example - the 
> attributes of the parent element. In other cases, the DigesterContext 
> might have more info than the Matcher requires. The algorithm in the 
> current RulesBase is enough for many use-cases and only needs a simple 
> string to match against. Here the DigesterContext adds unneeded overhead.
> So, my idea was to make the Matcher a SAX ContentHandler and not pass 
> a DigesterContext (or something similar) at all:
>   public interface Matcher extends ContentHandler {
>       ...
>       // return all patterns that match the current XML document
>       // context
>       public List matches();
>       ...
>   }
> Digester in turn would call all the Matcher's ContentHandler methods 
> from its own ContentHandler implementation methods. The Matcher could 
> get all the information (and *only* the information) about the parsed 
> XML that it actually needs to perform its matching tasks.
> I've done some experimenting with a RulesAdapter (i.e. a class that 
> implements Matcher but wraps around a Rules implementation), and it 
> looks like we could get away with almost no breaking API changes.
> Again, I actually like this approach much better than my own 
> DigesterContext idea... it seems cleaner, leaner and more flexible. 
> What do you think?
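
For readers following the thread, the ContentHandler-based Matcher described in the quoted message might look roughly like the sketch below. The Matcher methods are taken from the quoted interface; PathMatcher and MiniDigester are hypothetical names, and the exact-path matching rule is an assumption chosen for brevity, not the RulesBase algorithm.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// The interface from the quoted proposal, with generics added.
interface Matcher extends ContentHandler {
    void add(String pattern);
    void clear();
    List<String> matches();
    List<String> patterns();
}

// Hypothetical matcher: a pattern matches only if it equals the
// current element path, e.g. "a/b" (an assumption for this sketch).
class PathMatcher extends DefaultHandler implements Matcher {
    private final List<String> patterns = new ArrayList<>();
    private final List<String> path = new ArrayList<>(); // stack of open elements

    @Override public void add(String pattern) { patterns.add(pattern); }
    @Override public void clear() { patterns.clear(); }
    @Override public List<String> patterns() { return new ArrayList<>(patterns); }

    @Override public void startElement(String uri, String localName,
                                       String qName, Attributes atts) {
        path.add(qName);
    }

    @Override public void endElement(String uri, String localName, String qName) {
        path.remove(path.size() - 1);
    }

    // Matching patterns, in the order they were added.
    @Override public List<String> matches() {
        String current = String.join("/", path);
        List<String> result = new ArrayList<>();
        for (String p : patterns) {
            if (p.equals(current)) result.add(p);
        }
        return result;
    }
}

// Hypothetical stand-in for Digester: forwards SAX events to the Matcher,
// then asks it which patterns match. A real Digester would look up the
// rules for each matched pattern (for example in a map) and fire them.
class MiniDigester extends DefaultHandler {
    private final Matcher matcher;
    final List<String> fired = new ArrayList<>();

    MiniDigester(Matcher matcher) { this.matcher = matcher; }

    @Override public void startElement(String uri, String localName,
                                       String qName, Attributes atts)
            throws SAXException {
        matcher.startElement(uri, localName, qName, atts);
        fired.addAll(matcher.matches());
    }

    @Override public void endElement(String uri, String localName, String qName)
            throws SAXException {
        matcher.endElement(uri, localName, qName);
    }

    // Returns true if the document was parsed without error.
    boolean parse(String xml) {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), this);
            return true;
        } catch (Exception e) {
            return false;
        }
    }
}
```

Because the Matcher sees the raw SAX events, an XPath-style implementation could also keep attributes of ancestor elements, while a simple one like this keeps only the element names.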
