cocoon-dev mailing list archives

From Marc Portier
Subject Re: [RT] Comparing Woody & XMLForm : towards a unified form handling (long)
Date Tue, 22 Jul 2003 09:54:55 GMT
Sylvain et al.,

We were patiently expecting that once it would be that:
Sylvain Wallez wrote:
> Hi all,

first of all: great work and gentle introduction, many thx

> Lately, I've been thinking a lot about form handling in Cocoon. The 
> reason for this is that I will very soon start a project which is 
> basically a large set of forms (about 40 different screens used to fill 
> an XML document containing collections having up to 1000 or 2000 items). 
> As part of our proposal for the project, I did some prototyping with 
> XMLForm (+flowscript) and liked its lightweight markup and the strong 
> separation it enforces between form definition and form layout. But I 
> disliked its poor syntactical validation facilities. On the other side, 
> we have Woody which is very good at validating data but which I find 
> heavy to use and defines its own schema language. So this RT is my 
> attempt to make a synthesis of the good and bad points of both 
> frameworks, augmented with my own ideas, so that we can move towards a 
> single unified form handling package in Cocoon.

hear, hear!

it is not going to be an easy task for sure... the *one size fits 
all* -goal is not always reachable:

indeed: on the one side a framework like this should be _useful_ 
(in the sense of being complete and fast to use)
but: on the other hand should not be limiting anyone to _only_ 
the envisioned possibilities of its creators... (that is always 
the case, but I guess you understand what I'm trying to say)

the 80-20 rule will hopefully guide us (since that one is often 
creating the itches we want to scratch)

> Disclaimer : I don't want to start a war between Woody and XMLForm, but 
> just try to analyze what we have today and expose what I (hence it's 
> subjective) consider as good. Discussion is of course welcomed. Also, I 
> may have missed some features of one or the other framework. In that 
> case, please don't shoot at me, but be kind enough to explain what I 
> missed !

will do, and same position here, I can learn some from jxforms, 
and Woody can benefit from your proposed enhancements in the process

> Also, I'll speak about XMLForm, even if it's somewhat dead and replaced 
> by JXForms (essentially a cleaner rewriting of the XMLFormTransformer 
> and an update of the markup to the latest XForms draft), because all 
> criticisms about XMLForm below come from the original XMLForm and not 
> the JXForms work.
>                           ---oOo---
> General overview
> ----------------
> Both Woody and XMLForm use the same basic principles :
> 1/ Content production : a form template is "instantiated", i.e. it is 
> filled with values coming from a data model, and the instantiated form 
> is transformed to the target language (e.g. HTML) using generic and/or 
> custom stylesheets that know how to render the various widgets.
> 2/ Form validation : upon form submission, values are validated and 
> stored into a data model, and violations are produced if some validation 
> error occurs (validations involving several fields are also possible). 
> In case of error, the form can be redisplayed with the violations.
> But, as we will see below, the notions of form template, data model and 
> validation are very different in Woody and in XMLForm.
>                           ---oOo---
> Form definition
> ---------------
> Woody separates form definition, form template and form instance (3 
> different namespaces). The form definition is a kind of schema language 

the new form binding introduces a 4th namespace
but the good news is that the form-instance tags are not to be 
developer-written (hm, but the xsl converting it into whatever is 
dealing with it)

> that defines every widget in the form with its label, datatype and 
> validation constraints. The template contains references to form fields 
> mixed with foreign markup (such as HTML). It is instantiated using the 
> WoodyTransformer : every field present in the template is replaced by 
> the corresponding instance according to the form definition.


one small remark: you could choose not to use the template.
in that case one uses the WoodyGenerator which will produce the 
FULL XML representation of the form-instance at the start of your 
pipe (foreign markup is then typically decorated on the stream by 
any wild mix of xslt, xinclude, ...)

but I get your point:

if I got this correct then the big advantage of xmlforms you 
stress here is that the 'template' defines the _model_ while 
woody explicitly has a separate file for the latter, correct?

in fact woody introduced the template approach to be able to skip 
the XSLT requirement, so opting for no-template will leave you in 
probably an even worse spot...

> Woody has no notion of application model, as it stores field values in 
> its own data structure, which must be read and written to the 
> application model. Work is underway in this area with a JXPath based 
> binding.


> XMLForm has only one markup, inspired by the W3C's XForms specification. 
> This markup is more or less equivalent to the Woody template (it accepts 
> foreign markup), which is instantiated ("augmented" would be better) 
> with either the XMLFormTransformer/JXFormsTransformer or the 
> JXFormsGenerator. Form fields contain XPath references to the data 
> model, which can therefore have an arbitrary complexity.
> <my-opinion>
> XMLForm is way easier to setup to produce forms : a single file, a data 
> model containing any mixture of objects handled by JXPath (JavaBeans, 
> DOM elements, etc), XPath expressions everywhere, and you're done. But 
> as soon as there's a need for data whose formatting is more than 
> toString(), such as dates and float values, and even more in an I18Nized 
> environment, XMLForm shows strong limitations, mainly related to lack of 
> proper formatting functions in XPath.
> As JXPath supports extension functions, building a library of formatting 
> functions can be a solution to circumvent XPath's reduced function set. 
> But we'll see below that there's still a problem with parsing submitted 
> form data.
> Woody, on the other hand, is more complicated to set up, as two files 
> are needed (form definition and form template), with many 
> cross-references (field IDs). But Woody shines for complicated 
> formatting (see <convertor> directives) and I18N.
> IMO, Woody's separation of concerns between form definition and template 
> is not that good. Woody would be easier to use if the definition file 

every identification of a concern indeed creates a new 
responsibility to be taken up...
it has been the classic approach to have those different 
responsibilities be expressed through different files/namespaces 
(since multiple people could/should be involved, thus allowing to 
map responsibilities onto person-attached files)

in this case this is leading us to 'woody requires a lot of 
configuration' (your remark on this is not new and the upcoming 
binding is likely to make it worse)

so any suggestions on sensibly cutting some of the config trouble 
makes all the sense in the world.

the full process of getting anywhere has 2 aspects IMHO:
1/ identify all the concerns and separate wisely
2/ recombine sensibly in 'assembled' typical usages that lower 
the 80-20 itch for specific use cases... (in dream mode: generate 
the different config needs from a single source that might be as 
wild as JDBC metadata information?)

Woody up to now had some stress on 1/ (and I think we're not even 
there) it sure makes sense to start considering 2/ if we want to 
increase its 'usability'

> was only a schema defining datatypes and if fields were defined only in 
> the template. Although there is a great probability that datatypes can 

mmm, I might be missing what you are saying here...

looking at the woody-definition file what I see is exactly 
identifying data-types, but then I'm talking about composite 
types rather than only single-field types (e.g. the composite 
'person' type vs. his 'birthday'-date-field)

so what I might be reading wrongly here is that you would like to 
see the set of woody-form-definition files to evolve into some 
'datatype-catalogue' ?

hopefully that would still include the composite types it 
focusses on now, and just provides for a reuse mechanism of 
forms-subforms and predefined 'field-types'

> be reused for different fields and even different forms, I'm not sure 
> using the same fields within different templates really makes sense. For 
> example, HTML and WML browsers have so much different screen sizes and 
> interaction constraints that a single form definition can hardly be used 
> for both.

the envisioned use case behind the current split is exactly this: 
if we consider the HTML and WML front end versions of the same 
use cases then most likely the templates will need to change, but 
the model could remain the same, no?

e.g. in the case of WML you'd probably split the complete 
editing of the one form-model over a wizard-managed series of 
templates... (and/or you would choose not to show the optional 
fields of the model)

to be hoped for is that both cases would reuse:
- label/help/hotkey info (i18n)
- validation rules on the complete form-model
- the logic loading and submitting the complete filled-in 
'form-model' back to whatever back-end

other examples would include:
- deploying the web application in an ASP model where the 
different tenant companies only provide their templates (leaving 
out optional fields, splitting large models in different 
ways,...) and reuse the same back-end logic

- not use any template at all, but just expose a ReST-like URL 
pattern people can send requests to, and receive an XML back?

I'd have to admit these are not classic use cases but hardly to 
be overlooked by a modern form framework if you ask me... again I 
think the goal of being widely usable probably starts by 
separating the different concerns in the core of the thing and 
then carefully combining some of them back again in very 
specific/targeted ways of deployment?
(I see current cocoon success as proof of that pudding)
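
A minimal sketch of the split argued for above, in plain Java rather than Woody's actual API: one form model carries the fields and their validation rules, and each front end only supplies a template that names the fields it wants to show. Every name here (SharedFormModel, Field, personForm, the field ids) is hypothetical and only serves the illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch, not Woody's API: one form model shared by several templates.
final class SharedFormModel {

    // What both front ends are hoped to reuse: label, required flag, validation rule.
    static final class Field {
        final String id;
        final String label;
        final boolean required;
        final Predicate<String> rule;

        Field(String id, String label, boolean required, Predicate<String> rule) {
            this.id = id;
            this.label = label;
            this.required = required;
            this.rule = rule;
        }
    }

    private final Map<String, Field> fields = new LinkedHashMap<>();

    SharedFormModel add(Field field) {
        fields.put(field.id, field);
        return this;
    }

    // Validates against the complete model, whichever template collected the values.
    List<String> validate(Map<String, String> submitted) {
        List<String> violations = new ArrayList<>();
        for (Field f : fields.values()) {
            String value = submitted.get(f.id);
            if (value == null || value.isEmpty()) {
                if (f.required) violations.add(f.id + ": required");
            } else if (!f.rule.test(value)) {
                violations.add(f.id + ": invalid");
            }
        }
        return violations;
    }

    // A template is reduced to an ordered selection of field ids; unknown ids are rejected.
    List<String> render(List<String> templateFieldIds) {
        List<String> lines = new ArrayList<>();
        for (String id : templateFieldIds) {
            Field f = fields.get(id);
            if (f == null) throw new IllegalArgumentException("unknown field: " + id);
            lines.add(f.label + ": [" + id + "]");
        }
        return lines;
    }

    static SharedFormModel personForm() {
        return new SharedFormModel()
            .add(new Field("name", "Name", true, v -> v.length() <= 40))
            .add(new Field("email", "E-mail", true, v -> v.contains("@")))
            .add(new Field("nickname", "Nickname", false, v -> v.length() <= 20));
    }

    public static void main(String[] args) {
        SharedFormModel form = personForm();
        // An HTML template might show everything; a WML one might drop the optional field.
        System.out.println(form.render(List.of("name", "email", "nickname")));
        System.out.println(form.render(List.of("name", "email")));
        System.out.println(form.validate(Map.of("name", "Marc", "email", "nowhere")));
    }
}
```

The point of the sketch is only that validate() never needs to know which template was used, which is the reuse hoped for between the HTML and WML cases.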

> Reusing datatypes for different fields would also increase the overall 
> application consistency : as of today, if two fields have the same 

agree, datatype-catalogue and form-subform recursiveness for 
composite types could achieve this

I still see the split from the template as having a different use.

so by using all these words to be intellectually correct I just 
forget to stress that I very much share your fear for the 

How to get a very practical way to setup a one file config 
approach for simple stuff :-( without compromising the wider 
usage and flexibility?

suggestions welcome...

> datatype and constraints, these must be duplicated. This could also open 
> the door to other schema languages (WXS, RNG, etc).
> </my-opinion>
>                           ---oOo---
> Population and validation
> -------------------------
> "Population" is the term used to designate the action of "filling" the 
> data model with form-submitted data. "Validation" is the action of 
> controlling that submitted data is valid, i.e. that it satisfies some 
> syntactic and semantic constraints.
> Upon form submission, XMLForm traverses all request parameters and tries 
> to set their value on the data model using JXPath. A feature allows to 
> filter request parameters that are not part of the data model. If the 
> data model was filled correctly, a validation is performed using 
> Schematron. This allows to have finer-grained or inter-field controls, 
> again using XPath expressions. Each of these two phases can produce 
> violations, which are recorded in the Form object.
> Upon form submission, Woody traverses the form's widget tree, and each 
> widget is responsible to parse the corresponding request parameter and 
> validate its value. Non-visual widgets are also provided to perform 
> inter-field controls.
> <my-opinion>
> Here again, XMLForm is very easy to use but shows some strong 
> limitations : because it's designed after XForms, XMLForm has no feature 
> to specify how to parse form parameters (strings) into strongly typed 
> data. So even basic parsing of e.g. dates is not possible, and 
> locale-dependent parsing is clearly not possible.
> The Schematron validation has less restrictions since it deals with the 
> populated data model, and thus on strongly typed data, if they could be 
> parsed in the population phase.
> XMLForm also has what I consider a strong security weakness : the 
> default request parameter filter rejects only special parameters such as 
> "cocoon-action-*", which means that a request can be hacked that 
> modifies a part of the data model that wasn't available as a form field. 
> Considering that programmers are lazy (as I am), the form model will 
> often be the actual business object. The consequences of providing a 
> form to a user to update her location information can be catastrophic if 
> the User class contains "address", "phoneNumber", but also 
> "accessRights"...
> W3C XForms, which inspired XMLForm, is a client-side specification 
> targeted at producing XML documents validated by a WXS (W3C XML Schema). 
> But XMLForm is server-side, and doesn't enforce any particular schema 
> language. This means that very few features of XForms are actually used 
> except the form markup and that all has to be invented to produce a 
> featured server-side form framework, particularly in this population & 
> validation phase.
> Woody, by traversing the widget tree that was used to produce the form, 
> doesn't have the security weakness of XMLForm since only parameters 
> present in the produced form are considered. Also, its strong parsing 
> and I18N features make custom formatting really easy.
> But, being limited to the form's data model, complex validations 
> involving form data and application data can be difficult to do with 
> Woody and will need custom Java code.

yep, touching the 80-20 rule again
validation can become just really arbitrary complex so an escape 
to provided java-code is not to be prevented IMHO

how much of the stronger validation will be available under 
declarative form is then a matter of time, itches and scratches
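
To make the security remark quoted above concrete, here is a small sketch in plain Java (not XMLForm or Woody code) that contrasts request-driven population with widget-driven population. Plain maps stand in for the request and the model; the class and method names are hypothetical, and the filter only mimics the "cocoon-action-*" rule described in the quoted text.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: two ways of copying request parameters into a model.
final class PopulationSketch {

    // Request-driven: every parameter is applied, so a forged one reaches the model.
    static Map<String, String> populateFromRequest(Map<String, String> request) {
        Map<String, String> model = new LinkedHashMap<>();
        for (Map.Entry<String, String> p : request.entrySet()) {
            // Mimics a filter that rejects only the special parameters.
            if (p.getKey().startsWith("cocoon-action-")) continue;
            model.put(p.getKey(), p.getValue());
        }
        return model;
    }

    // Widget-driven: only fields that were part of the produced form are looked up.
    static Map<String, String> populateFromWidgets(Set<String> widgetIds,
                                                   Map<String, String> request) {
        Map<String, String> model = new LinkedHashMap<>();
        for (String id : widgetIds) {
            String value = request.get(id);
            if (value != null) model.put(id, value);
        }
        return model;
    }

    public static void main(String[] args) {
        Map<String, String> request = new LinkedHashMap<>();
        request.put("address", "1 Main Street");
        request.put("phoneNumber", "555-0100");
        request.put("accessRights", "admin"); // forged, never offered as a form field
        request.put("cocoon-action-save", "Save");

        Set<String> widgets = Set.of("address", "phoneNumber");
        System.out.println(populateFromRequest(request).containsKey("accessRights"));
        System.out.println(populateFromWidgets(widgets, request).containsKey("accessRights"));
    }
}
```

The first call lets the forged accessRights value through, the second never looks at it, because the loop is driven by the widgets and not by the request.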

> Finally, Woody uses its own expression language, which IMO is not a good 
> choice if we consider that "standard" expression languages such as Jexl 
> exist and are already used in other Cocoon blocks.

no real opinion here,
consistency with other stuff does make sense of course

I might be wrong but the simple expression syntax is more a means 
for reusing by assembling smaller parts of ready Java code

the alternative would be that you'd need to write a Java class 
that does that assembly and then the definition file would list 
the fully qualified classname of that new beast.

I don't know jexl well enough to evaluate it for this usage... your 
advice and hints are welcome

> </my-opinion>
>                           ---oOo---
> Mapping to the application data model
> -------------------------------------
> A form is useless if its content cannot be mapped in some way to the 
> application data model.
> XMLForm has no special provision for mapping form data to application 
> data, but using JXPath makes it easy to fill any JavaBean or any DOM 
> structure. Post-validation application behaviour can be added to either 
> a subclass of AbstractXMLFormAction or in a flowscript.
> Woody currently does not provide anything to map form data to 
> application data and all this must be coded either in a subclass of 
> AbstractWoodyAction or in a flowscript. But there's work underway to add 
> binding features to Woody, the first incarnation being based on JXPath.
> <my-opinion>
> XMLForm makes it easy (as pointed out above) for the lazy programmer to 
> set the application data as the form model : mapping is then immediate 
> and totally transparent. But along with the security problem mentioned 
> above, this also means that when a form population & validation fails, 
> it is very likely that some fields already have been modified, 
> potentially leaving the data model in an inconsistent state.
> So the secure and clean solution is to use a form-specific data model (a 
> JavaBean, DynaBean or XML DOM), but this requires then custom code to 
> copy form data to the application data model, thus losing the 
> simplicity provided by JXPath.
> The ongoing work on Woody binding potentially allows a great range of 
> target data models : the current JXPath binding will make it easy to map 
> form data to an arbitrary data structure, without XMLForm's limitations 
> since parsed and strongly typed data will be stored in the application 
> model. But we can also imagine other declarative bindings targeted at 
> e.g. relational databases (no intermediate bean), EJBs, etc.


haven't looked into jxpath deep enough (just used it now, didn't 
get into the internals yet) but my current feeling would be to 
cater for these other backend-models by writing a specific 
JXPathContext wrapper... as such the effort would be reusable 
more widely?
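
For the inconsistent-state problem quoted above, a minimal sketch of the "form-specific data model" idea in plain Java: submitted values are parsed and validated into a staging map, and they are copied to the application object only when no violation was found. This is not the Woody binding API and does not use JXPath; User, stage and submit are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a form->application binding that is not "live".
final class StagedBinding {

    // Stand-in for an application object.
    static final class User {
        String address = "old address";
        int age = 30;
    }

    // Parses and validates into a staging map; the application object is not touched here.
    static List<String> stage(Map<String, String> submitted, Map<String, Object> staged) {
        List<String> violations = new ArrayList<>();
        String address = submitted.get("address");
        if (address == null || address.trim().isEmpty()) {
            violations.add("address: required");
        } else {
            staged.put("address", address.trim());
        }
        String rawAge = submitted.get("age");
        try {
            int age = Integer.parseInt(rawAge == null ? "" : rawAge.trim());
            if (age < 0) violations.add("age: must not be negative");
            else staged.put("age", age);
        } catch (NumberFormatException e) {
            violations.add("age: not a number");
        }
        return violations;
    }

    // Copies staged values to the application object only when validation passed.
    static List<String> submit(Map<String, String> submitted, User user) {
        Map<String, Object> staged = new LinkedHashMap<>();
        List<String> violations = stage(submitted, staged);
        if (violations.isEmpty()) {
            user.address = (String) staged.get("address");
            user.age = (Integer) staged.get("age");
        }
        return violations;
    }

    public static void main(String[] args) {
        User user = new User();
        System.out.println(submit(Map.of("address", "new address", "age", "abc"), user));
        System.out.println(user.address); // still the old value: the age failed to parse
    }
}
```

The cost is the extra copy step in submit(), which is exactly the custom code the quoted text says is needed once the application object is no longer the form model itself.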

> </my-opinion>
>                           ---oOo---
> I18N
> ----
> I18N features should be separated in two main areas :
> - I18Nization of form labels and item values (i.e. combobox labels)
> - I18Nization of textbox inputs, such as floating point numbers, dates, 
> etc.
> For the first item, both XMLForm and Woody accept any foreign markup in 
> widget labels, including <i18n:*> tags for use with the I18NTransformer. 
> Woody lacks the equivalent to <xf:help> but this was recently discussed 
> and should be added soon. XMLForm also allows labels and similar items 
> to have their content fetched from the form model using a "ref" 
> attribute. In that case, however, only characters are produced, and not 
> mixed content.
> For the second item (i18nization of inputs), XMLForm has no support, as 
> it hardly supports custom formats, as explained previously. Woody, on 
> the other hand, has strong support for i18nization of inputs through its 
> <convertor> tag that supports locale-specific patterns for formatting 
> and parsing.
> <my-opinion>
> XMLForm's strong limitations for values formatting also apply to the 
> i18n domain, whereas Woody not only provides strong support for value 
> formatting, but also strong support for locale-dependent formatting.
> XMLForm's "ref" attribute on form labels allows messages to be part of 
> the form model, and thus be dynamic, but I'm not sure this is of real 
> use. And if it is, Woody may be able to provide an equivalent through 
> nested tags in the <wd:label> element.
> </my-opinion>
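
As an illustration of what locale-specific parsing of textbox inputs involves, here is a sketch using standard JDK classes only. It is not Woody's <convertor> implementation; the class name, method names and patterns are assumptions chosen for the example.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;

// Hypothetical sketch of locale-aware parsing of submitted strings into typed values.
final class LocaleParsing {

    // Parses a number with the locale's separators; null if the input is not fully consumed.
    static Double parseNumber(String text, Locale locale) {
        NumberFormat format = NumberFormat.getNumberInstance(locale);
        ParsePosition pos = new ParsePosition(0);
        Number n = format.parse(text, pos);
        if (n == null || pos.getIndex() != text.length()) return null;
        return n.doubleValue();
    }

    // Parses a date with an explicit pattern; null when the text does not match.
    static LocalDate parseDate(String text, String pattern, Locale locale) {
        try {
            return LocalDate.parse(text, DateTimeFormatter.ofPattern(pattern, locale));
        } catch (DateTimeParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // The same quantity typed by a French and by a US user.
        System.out.println(parseNumber("3,14", Locale.FRANCE));
        System.out.println(parseNumber("3.14", Locale.US));
        // A day-first date pattern, as a French form might use.
        System.out.println(parseDate("22/07/2003", "dd/MM/yyyy", Locale.FRANCE));
    }
}
```

Returning null stands in for producing a violation; a real form framework would record an error message for the field.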
>                           ---oOo---
> Conclusion
> ----------
> XMLForm has a lot of success because it has filled a giant need in 
> Cocoon applications to handle forms. Moreover, it fits nicely with 
> flowscript, and this combination builds an easy to use solution for form 
> handling. But using it in more and more complex use cases shows some 
> strong limitations that are largely related to its desire to mimic 
> XForms. And I'm not sure these limitations can be removed without 
> diverging largely from the XForms approach.
> These limitations were obviously taken into account early in Woody's 
> design, which makes it stronger at handling data formatting and enforcing 
> semantic constraints. But Woody, by over-separating concerns, is more 
> heavy to use.
> Considering all the pros and cons, I think Woody, which is still in its 
> infancy, is more promising on the long term and should be promoted, once 

mmm, let's make that
* puberty (flirting around with different ideas, allowing 
oneself to think the wrong ones even :-))

which means we still have to get through
* adolescence (forming the real identity, gradually taking up 
responsibility towards early adopters) and
* maturity (living its life, being used) before we reach the 
stage of
* aged wisdom (where we have removed everything there was to 
remove, dear Antoine)

IMHO a number of good discussions and some try-out code could get us 
quickly past the first two stages

> featured enough, as the preferred form handling package in Cocoon.
>                           ---oOo---
> Proposals
> ---------
> We've seen that Woody requires to separate form definition from form 
> template. I think (Bruno, correct me if I'm wrong) this constraint comes 
> from the fact that the form _is_ the model, and thus must be filled with 
> data _before_ being processed by the form template.


> The ongoing work on form binding considers binding as a process 
> surrounding form population and validation : the application->form 
> binding fills an existing form, and the form->application binding 
> transfers form data to the application model once the form is correctly 
> validated.
> Now we can imagine to have a "live" application->form binding occurring 
> at form definition time which could allow simultaneous building of the 
> form definition and population of form data from the binding. This 

above sounds like allowing 'default' values in the form definition

combined with the fact that the default values themselves would 
be collected at form-instantiation time?

care to elaborate how you saw this happen?

> feature could remove the need for a separate form definition and could 
> be implemented by a WoodyTemplateGenerator taking as input a template 
> file containing field definitions. A kind of "definition by example" 
> (like the QBE that exists in Excel and various database systems).
> This "defining-template" would only define fields and not datatypes. 
> These datatypes could be either inferred from the application model 
> through the binding or fetched from a separate schema file (the current 
> form definition, with only datatypes definitions).
> On the other hand, form->application binding cannot be live, since we 
> must ensure that all submitted values are valid before modifying the 
> application data.

IIUC you're giving a shot at redistributing the information from 
the three current config files (form-definition, form-template and 
form-binding) into two reshaped ones: form-template and 

maybe an example of how things would look would give us more 
to chew on?

>                           ---oOo---
> Thanks for reading so far. As I expect this post to generate lots of 
> discussions, I suggest to create separate threads for particular 
> subjects (particularly the final "proposals" chapter) in order to keep 
> the discussion focused.

didn't do that yet... if everything remains related I 
prefer one combined thread to multiple separate ones (but that's 
just me, I'll be happy to go with the flow and see some separate 
[woody proposal] threads pop up)

> Sylvain

Marc Portier                  
Outerthought - Open Source, Java & XML Competence Support Center
Read my weblog at                        
