forrest-dev mailing list archives

From Joerg Pietschmann
Subject Re: [RT] Entities in XML docs
Date Sun, 29 Dec 2002 19:38:04 GMT
On Sunday 29 December 2002 04:47, Jeff Turner wrote:
> That was Stefano's suggestion: that we do text-only expansion for now
> (element expansion is still possible with xinclude), and when we migrate
> to a decent schema language we can think about removing the text-only
> restriction.

Why not migrate to either a more powerful schema language
or another validation process right now?
AFAIR your proposal was meant as a mechanism to supplant
XML entities, in particular in contexts where it is hard for users
to get their entity definitions into the DTD.
The problem you want to avoid is that a document with <xi:include>
or <nn:replace> would not validate.

Entities work because they are part of the DTD against which the
parser validates and because the parser expands them before
examining the context for validation.
In any other approach, the parser does not know about the
substitutions to be made. Because the validation is, historically,
still an integral part of the parsing step, rather than a separate
step, this may cause problems. This is independent of whether
the substitution is done by XInclude, an XSLT replacing
<nn:replace> elements or ${} substitution.
This doesn't mean we can't solve the problem: Run a processor
doing the expansion, then a validator. If performance doesn't
matter all that much, an intermediate file can be used. Unfortunately,
I don't know of any validator taking a SAX event stream as input
for better performance, but I'm sure if the need arises, someone
will take care of this. The only remaining problem is
schema-directed editors.
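
The expand-then-validate order can be sketched as two separate passes.
This is only a minimal sketch in Python under stated assumptions: the
${name} syntax is the one discussed in this thread, but `expand`,
`check_foo` and the `REPOSITORY` table are hypothetical names, and the
hand-written check merely stands in for a real validator (the Python
standard library has no DTD validator). The toy content model is that
<foo> must contain exactly one <a> and nothing else.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical substitution table; a real setup would read it from a repository file.
REPOSITORY = {"foo": "<a>bar</a>"}

def expand(source: str) -> str:
    """Pass 1: replace every ${name} with its definition (single, non-recursive pass)."""
    return re.sub(r"\$\{(\w+)\}", lambda m: REPOSITORY[m.group(1)], source)

def check_foo(document: str) -> bool:
    """Pass 2: stand-in for a validator; <foo> must hold exactly one <a>, no text."""
    root = ET.fromstring(document)  # raises ParseError if not well-formed
    children = list(root)
    has_text = bool((root.text or "").strip()) or any(
        (child.tail or "").strip() for child in children
    )
    return root.tag == "foo" and [c.tag for c in children] == ["a"] and not has_text

source = "<foo>${foo}</foo>"
valid_before = check_foo(source)         # checked too early: placeholder still present
valid_after = check_foo(expand(source))  # checked after expansion
```

The unexpanded source fails the check and the expanded one passes, which
is why the validator has to run as a step after the substitution rather
than inside the parser.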

> I don't fully understand why we can't give users the option to shoot
> themselves in the foot by including elements, but implementation-wise
> there's little difference (two different InputModules).
An easy implementation doesn't mean there are no problems.
1. Entity expansion is recursive. Is ${} expansion recursive too?
  Like foo -> ${bar} and bar -> baz.
  How do you avoid loops? <evil grin>
2. Is something like ${${foo}} allowed, supposing "foo" is substituted by
  "bar" and "bar" by "baz"? Don't forget to explain the difference from
  recursive expansion as in 1.
3. An XML file in which a ${} stands in for a subtree with mandatory
  elements is not valid as written. For example
  <!DOCTYPE foo [
    <!ELEMENT foo (a)>
    <!ELEMENT a (#PCDATA)>]>
  and foo expands to <a>bar</a>.
  That's the point of restricting substitutions to text.
4. Elements in ${} substitution get their namespaces from the repository,
 I think. Like if foo -> <nn:a>, the binding for the nn prefix is taken from
 the repository XML file rather than from the document where ${foo}
 occurs. XInclude has the same problem, but then, the XInclude spec
 takes care of this aspect.
 Well, namespaces and entities mix even less well.
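
Points 1 and 2 can be made concrete with a small sketch. It assumes,
purely for illustration, that definitions live in a dictionary, that
expansion is recursive, and that a nested ${${foo}} is resolved
innermost-first; none of this is settled in the proposal. `resolve` and
the three tables are hypothetical names. To catch loops the resolver has
to remember which names are currently being expanded.

```python
import re

# Matches an innermost ${name}, i.e. one with no further "${" inside it.
INNERMOST = re.compile(r"\$\{([^${}]+)\}")

def resolve(text, defs, active=()):
    """Expand ${name} recursively; 'active' holds the names being expanded.

    An undefined name raises KeyError; a loop raises ValueError.
    """
    while True:
        match = INNERMOST.search(text)
        if match is None:
            return text
        name = match.group(1)
        if name in active:
            raise ValueError("expansion loop: " + " -> ".join(active + (name,)))
        # Expand the definition itself before splicing it in (point 1).
        value = resolve(defs[name], defs, active + (name,))
        text = text[:match.start()] + value + text[match.end():]

case1 = {"foo": "${bar}", "bar": "baz"}  # point 1: recursive definition
case2 = {"foo": "bar", "bar": "baz"}     # point 2: computed name
loop = {"x": "${y}", "y": "${x}"}        # the <evil grin> case

recursive = resolve("${foo}", case1)     # "baz"
nested = resolve("${${foo}}", case2)     # "${bar}" and then "baz"
plain = resolve("${foo}", case2)         # "bar"
```

Calling resolve("${x}", loop) raises ValueError instead of recursing
forever. Note that case1 and case2 give the same result by different
routes: in the first the indirection sits in the definition, in the
second it sits in the document.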
Last but not least I think giving users plenty of means to shoot themselves
in the foot is not a very good approach, even if the users demand them.
Read through the discussions about <xsl:script> on the XSL list for some

> > XML editors
> vim + xmllint
External validation, can be handled easily.

> > - Write a customized toolset.
> ?
The processor doing the substitution, perhaps catalogue support, cross
references, authoring support. Someone might also want to have a
processor working outside Cocoon.

> Just like the C preprocessor, It is an opt-in solution to a practical
> problem.
I've seen simple "solutions to practical problems" used and getting into
deep doo-doo in the long term much too often. This kind of pragmatism
brought us BASIC, file name suffixes denoting the content format, Tag
Soup and the unmentionable abominations related to what's commonly
called gHorribleKludge on XML-DEV. I still think the world would be a
better place if such aberrations had been avoided. Also, propagators
of "pragmatic solutions" tend to walk on to the next buzz, leaving the
mess to others to clean up. :-/

