commons-dev mailing list archives

From Apache Wiki <>
Subject [Jakarta-commons Wiki] Update of "Digester/FAQ/XmlRulesOrNotXmlRules" by SimonKitching
Date Sat, 10 Sep 2005 02:34:11 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Jakarta-commons Wiki" for change notification.

The following page has been changed by SimonKitching:

New page:

== Overview ==

The Digester object needs a set of rule objects in order to define how to process the input.
These can be configured using the API:
  digester.addObjectCreate("some/path", ...);

or via an xmlrules file:
  Digester digester = DigesterLoader.createDigester(rulesURL);
where rulesURL refers to a file looking like this:
  <pattern value="address-book">
    <pattern value="person">
      <object-create-rule classname="Person"/>
    </pattern>
  </pattern>

== Supporters of xmlrules ==

A number of articles recommend the use of xmlrules, including:

A number of books published about Jakarta Commons projects also recommend xmlrules.

The original authors of the xmlrules project obviously thought it a good idea. However, none
of them have been involved in Digester development for many years now.

== Opponents of xmlrules ==

One Digester developer (Simon Kitching) recommends avoiding the xmlrules module and staying
with the standard Java API for configuring rules. No other recent Digester developer has
expressed any opinion on this.

=== Simon's reasons for avoiding xmlrules ===

I'm just not convinced it brings much benefit. When I first met Digester, I started using
the xmlrules module because it seemed to be the "highest abstraction". But in fact I found
that the xml being parsed and the Java classes that are being created/populated are
conceptually tightly coupled anyway. There just isn't much need to change the xml->class
mapping unless the classes are being changed. And if the classes are being changed then it
isn't really any more work to change a mapping defined as code than to change a mapping
defined in an external rules file.

And anyone writing an application using Digester will already be fluent in Java, so moving
the mapping from code to an external xml file doesn't make life any easier that way. In fact,
I think it makes things harder; I find the API easier to comprehend than the xmlrules format.
And certainly if you have any "bugs" in your mapping, then you really need to know how the
rule classes work. So in summary, the learning curve is *worse* for xmlrules than for the
underlying API.

And it's a nuisance for the Digester library maintainers, because after adding a new feature
to the API, we need to add it to xmlrules as well. And writing unit-tests for xmlrules is a
nuisance too.

And xmlrules has significant overhead when processing small input files, because the xmlrules
file needs to be parsed first to set up all the rules.

And there are some features you just can't access via xmlrules. One example is passing
references to arbitrary Java objects via the ObjectParamRule.

I can see xmlrules being useful in some situations. For example, when writing a
code-generation tool (e.g. from a UML diagram), it may be easier to generate xmlrules
definitions than calls to the Digester API.

But in general, I think xmlrules is *harder to learn*, has runtime CPU and memory overhead,
and brings no practical benefit.

And one other thing: it is very easy to add custom Rule classes when using the API; it's
somewhat more complex to do so when using xmlrules.

== Parsing xml files in multiple languages ==

The xml files that digester parses can be in multiple languages, eg an addressbook
might look like:
or might look like:

In this case the parsing rules stay the same but the pattern-matching strings associated with
those rules change. With the rule definitions in an xmlrules file, translation would be simple.

However, this could also be achieved while using the Java API via:
  digester.addObjectCreate(props.getProperty("ADDRESSBOOK_PATTERN"), ....);
where an external properties file maps the keys to the appropriate pattern string.
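
The properties-based lookup above can be sketched roughly as follows. This is only an illustration using the JDK's java.util.Properties: the key PERSON_PATTERN, the German pattern strings, and the AddressBook/Person class names in the comments are made up for the example, and the properties text is inlined where a real application would read an external file. The Digester calls are left as comments so the sketch runs without the Digester jar.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class PatternLookup {

    /** Parses properties text; a real application would read an external file instead. */
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Returns the pattern string for a key, failing fast if the mapping is missing. */
    static String pattern(Properties props, String key) {
        String value = props.getProperty(key);
        if (value == null) {
            throw new IllegalArgumentException("No pattern configured for " + key);
        }
        return value;
    }

    public static void main(String[] args) {
        // Hypothetical per-language mappings; only ADDRESSBOOK_PATTERN appears in the text above.
        Properties english = load(
                "ADDRESSBOOK_PATTERN=address-book\nPERSON_PATTERN=address-book/person\n");
        Properties german = load(
                "ADDRESSBOOK_PATTERN=adressbuch\nPERSON_PATTERN=adressbuch/person\n");

        for (Properties props : new Properties[] {english, german}) {
            String addressBook = pattern(props, "ADDRESSBOOK_PATTERN");
            String person = pattern(props, "PERSON_PATTERN");
            // With Digester on the classpath, the rules would be registered like:
            //   digester.addObjectCreate(addressBook, AddressBook.class);
            //   digester.addObjectCreate(person, Person.class);
            System.out.println(addressBook + " -> AddressBook, " + person + " -> Person");
        }
    }
}
```

The rule-building code stays identical for every language; only the properties text changes, which is the same benefit a translated xmlrules file would give.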
