commons-user mailing list archives

From Craig McClanahan
Subject Re: [Digester] class-specific parsing
Date Mon, 20 Dec 2004 02:56:26 GMT
While not necessarily covering *all* of your potential requirements,
you can get a long way if you follow some relatively simple
conventions for using the standard rules:

* The "Set Properties" rule will match up attribute names on
  a source element to settable properties on the corresponding
  class.  Thus, any configuration that can be passed on as
  simple property settings does not need to be specific to the
  actual implementation class.  (If you're familiar with Tomcat,
  this is the rule that is used to parse virtually every element
  in a "server.xml" file, without needing any prior knowledge of
  the actual implementation classes for each element.)

* The "Set Property" rule will allow you to have nested elements
  that take a name/value pair to set an arbitrary property to an
  arbitrary value.

* You can use wildcard patterns (a leading "*/") so that rules
  match regardless of an element's position in the nested
  hierarchy of XML elements in a file.

For example, consider the following rules configuration:

    Digester digester = ...;

    // Instantiate and configure a new widget
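    // "className" is the attribute used in the question's XML; when
    // present, it overrides the default class named here
    digester.addObjectCreate("*/widget", "com.mycompany.DefaultWidget",
                             "className");
    digester.addSetProperties("*/widget");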

    // Also recognize nested <property> elements
    digester.addSetProperty("*/widget/property", "name", "value");

This set of rules will allow the configuration of *any* class
representing a widget, with no requirement that the class extend any
particular base class or implement any particular interface.  It
would require some changes to your XML format, though; in particular,
substituting a <property name="x" value="y"/> element for your
<trinket> and <container> elements.

(As an alternative to the above, consider using a "Set Nested
Properties" rule, which will match each nested element's name to a
corresponding property, taking the body content of the nested element
as the value to be set.)
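
To make the convention behind the "Set Properties" and "Set Property"
rules concrete, here is a rough JDK-only sketch of the name matching
they depend on.  It is not Digester's implementation, and SampleWidget
and AttributeBinder are made-up names for illustration; the point is
only that a bean with ordinary setters can be configured without
extending or implementing anything.

```java
import java.lang.reflect.Method;
import java.util.Map;

/** Hypothetical bean: no base class, no interface, just setters. */
class SampleWidget {
    private String valuation;
    private int weight;

    public String getValuation() { return valuation; }
    public void setValuation(String valuation) { this.valuation = valuation; }
    public int getWeight() { return weight; }
    public void setWeight(int weight) { this.weight = weight; }
}

/** Illustration only: copies name/value pairs onto matching setters. */
final class AttributeBinder {

    static void apply(Object bean, Map<String, String> attributes) {
        for (Map.Entry<String, String> entry : attributes.entrySet()) {
            String name = entry.getKey();
            // Attribute "valuation" maps to setter "setValuation".
            String setterName = "set"
                    + Character.toUpperCase(name.charAt(0)) + name.substring(1);
            for (Method m : bean.getClass().getMethods()) {
                if (m.getName().equals(setterName) && m.getParameterCount() == 1) {
                    try {
                        m.invoke(bean, convert(entry.getValue(), m.getParameterTypes()[0]));
                    } catch (ReflectiveOperationException e) {
                        throw new IllegalStateException("Cannot set " + name, e);
                    }
                    break;
                }
            }
            // In this sketch, names without a matching setter are skipped.
        }
    }

    /** Minimal conversion: int and boolean only, everything else stays a String. */
    private static Object convert(String raw, Class<?> type) {
        if (type == int.class || type == Integer.class) return Integer.valueOf(raw);
        if (type == boolean.class || type == Boolean.class) return Boolean.valueOf(raw);
        return raw;
    }
}
```

Calling AttributeBinder.apply(widget, attrs) with the attributes of a
<widget valuation="cheap" weight="3"/> element would end up calling
setValuation("cheap") and setWeight(3) on the bean.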

The harder part of the equation to deal with, though, is
hierarchies.  Since we're often dealing with parent/child
relationships, one approach would be to require that any class you
wish to configure through a <widget> element include an addChild()
method that accepts an Object.  Then, you could add a rule like this
to allow <widget>s to be nested inside other <widget>s to an arbitrary
degree:

    digester.addSetNext("*/widget", "addChild", "java.lang.Object");
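
For reference, the contract this rule relies on is small: when the
element ends, the object on top of the stack (the child) is passed to
the named method on the object just below it (the parent).  A rough
JDK-only sketch of that hand-off, with NestableWidget and
SetNextSketch as made-up names (this is not Digester's own code):

```java
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

/** Hypothetical widget that accepts any child through addChild(Object). */
class NestableWidget {
    private final String name;
    private final List<Object> children = new ArrayList<>();

    public NestableWidget(String name) { this.name = name; }
    public String getName() { return name; }
    public List<Object> getChildren() { return children; }
    public void addChild(Object child) { children.add(child); }
}

/** Illustration only: what happens when a nested element closes. */
final class SetNextSketch {

    static void linkTopToParent(Deque<Object> stack, String methodName) {
        // With push(), iteration starts at the most recently pushed object.
        Iterator<Object> it = stack.iterator();
        Object child = it.next();   // top of the stack
        Object parent = it.next();  // next-to-top
        try {
            parent.getClass().getMethod(methodName, Object.class).invoke(parent, child);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot call " + methodName, e);
        }
    }
}
```

Because the parameter type is Object, the parent needs to know nothing
about the child's class, which is what lets arbitrary widget classes
nest inside one another.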

A similar approach could be set up to deal with situations like your
nested <locations> element, if you're willing to use a fixed name for
the container:

    digester.addSetNext(..., "addProperty", "java.lang.Object");

The hints above will get you part way towards your goals.  If you
seriously want to meet the "full generality" goal, however, you'll
want to investigate writing your own custom Rule implementation(s) to
create a widget, and then to process all the subordinate elements in a
manner that is really specific to the original class.  Note that
dynamically adding new rules isn't going to cut it -- those new rules
would apply to all <widget> elements later in the file, where you
asked for the rules to be specific to this element.

One particular strategy might be to use the NodeCreateRule, which
will absorb all of the child elements and give you back a DOM object
representing them.  Consider creating a custom subclass of this rule
that uses the base class implementation to build the DOM, and then
passes the result to some processing class whose name you also pass
in as an attribute on the "widget" element.  So, you'd end up saying
something like:

    <widget className="foo.TrinketBean" valuation="cheap" processor="...">

where the "processor" class would have some processing method that
accepted the object created by the rule (a TrinketBean in the case
above) and a DocumentFragment representing the nested XML content
inside this <widget>.  That way, you could encapsulate the actual
behavior into an instance of a Processor interface, which could be
specific to that particular kind of widget.
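
A rough sketch of that Processor idea, using only the JDK's DOM
classes.  The Digester rule wiring is left out; a plain
DocumentBuilder stands in for the custom NodeCreateRule subclass.
WidgetProcessor, TrinketProcessor, ProcessorDemo and this local
TrinketBean, along with the "processor" attribute name, are
assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical callback: one implementation per kind of widget. */
interface WidgetProcessor {
    void process(Object widget, DocumentFragment content);
}

/** Hypothetical stand-in for foo.TrinketBean. */
class TrinketBean {
    public final List<String> trinkets = new ArrayList<>();

    public void addTrinket(String name, String color) {
        trinkets.add(name + " (" + color + ")");
    }
}

/** Hypothetical processor that understands <trinket> children only. */
class TrinketProcessor implements WidgetProcessor {
    @Override
    public void process(Object widget, DocumentFragment content) {
        TrinketBean bean = (TrinketBean) widget;
        for (Node n = content.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && "trinket".equals(n.getNodeName())) {
                Element e = (Element) n;
                bean.addTrinket(e.getAttribute("name"), e.getAttribute("color"));
            }
        }
    }
}

final class ProcessorDemo {

    /** Creates the widget and its processor by name, then hands over the children. */
    static Object load(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element widgetElement = doc.getDocumentElement();
            Object widget = Class.forName(widgetElement.getAttribute("className"))
                    .getDeclaredConstructor().newInstance();
            WidgetProcessor processor = (WidgetProcessor)
                    Class.forName(widgetElement.getAttribute("processor"))
                            .getDeclaredConstructor().newInstance();
            // Move the nested content into a fragment, as the rule would hand it back.
            DocumentFragment fragment = doc.createDocumentFragment();
            while (widgetElement.hasChildNodes()) {
                fragment.appendChild(widgetElement.getFirstChild());
            }
            processor.process(widget, fragment);
            return widget;
        } catch (Exception e) {
            throw new IllegalStateException("Cannot load widget", e);
        }
    }
}
```

The design choice here is that the main parsing code only knows the
WidgetProcessor interface; everything about <trinket> versus
<container> lives in the processor that the XML itself names.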


On Sun, 19 Dec 2004 18:23:26 -0800, Walter Korman wrote:
> I'm using commons-digester to parse a config file and would like to
> set things up in an extensible manner, but am unsure of the best way
> to allow an arbitrary number of user-written classes to engage in
> their own custom parsing based on arbitrary class-specific data
> format.
> Appreciate any input from those who have spent more time working with
> the Digester than have I.
> Example file subset:
> <widget className="foo.TrinketBean" valuation="cheap">
>     <trinket name="emerald ring" color="green"/>
>     <trinket name="donkey" color="gray"/>
> </widget>
> <widget className="foo.ContainerBean" rarity="common">
>     <container name="basket" material="wicker"/>
>     <container name="bottle" material="glass"/>
>     <locations>
>         <location>Williams-Sonoma</location>
>         <location>Pottery Barn</location>
>     </locations>
> </widget>
> So, for each widget a className-specified object is to be
> instantiated, and its child elements parsed in a class-specific
> manner.
> TrinketBean might want to create TrinketRecord objects to correspond
> to each <trinket> element; or, it might just want to have an
> "addTrinket" method on itself called with the element attributes.
> The same for ContainerBean, which adds in the whole <locations>
> element wrinkle with a list of <location> elements to be stored.
> Since I want the widget element attributes, child elements, child
> element attributes, and overall parsing approach to be definable by
> the widget itself, the main code that configures the Digester doesn't
> know these in advance and so they can't be specified in an immediately
> straightforward manner.
> I see a few possible approaches:
> - Have each Widget provide a method to which the Digester is
>   passed when being configured to set up the Widget-specific parsing.
> - Fiddle around with a custom Rule.
> - Constrain the kind of widget-specific parsing permitted so that it
>   falls into a specific form that can be specified via up-front
>   Digester configuration but that would limit the flexibility of the
>   widget data format.
> - Walter