xmlbeans-user mailing list archives

From "Radu Preotiuc-Pietro" <ra...@bea.com>
Subject RE: CDATA Heuristics
Date Fri, 06 May 2005 20:55:06 GMT
The issue of CDATA and entitization has come up many times.
XmlBeans is 100% infoset-preserving, but the XML infoset makes no distinction as to how character
data is represented. So XmlBeans decides on its own when characters should be entitized
and when they should be saved as a CDATA section. The algorithm is:
- if the length of the text is < 32 chars, entitization is used
- otherwise, if there are at least 5 '<' or '&' characters and they also account for
  at least 1% of the text length, CDATA is used
- in every other case, entitization is used
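
To make the rule above concrete, here is a minimal sketch in Java. It is not the XmlBeans source; the class name, method name and constants are made up for illustration, and it only encodes the thresholds quoted in this message (32 chars, 5 special characters, 1%).

```java
// Illustrative only: a stand-alone restatement of the rule described above,
// not code taken from XmlBeans. All names here are hypothetical.
final class CdataHeuristic {
    static final int MIN_LENGTH = 32;   // shorter text is always entitized
    static final int MIN_SPECIAL = 5;   // minimum number of '<' or '&'
    static final int MIN_PERCENT = 1;   // minimum share of the text length

    /** Returns true if the rule picks a CDATA section, false for entitization. */
    static boolean prefersCdata(String text) {
        if (text.length() < MIN_LENGTH) {
            return false;
        }
        int special = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '<' || c == '&') {
                special++;
            }
        }
        // special / length >= 1%, written without floating point
        return special >= MIN_SPECIAL
            && (long) special * 100 >= (long) text.length() * MIN_PERCENT;
    }
}
```

So a 40-character string containing five '<' characters would go to CDATA, while the same five characters spread over 600 characters of text (under 1%) would be entitized.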

For V2, we looked into making this configurable, since we got feedback on this mailing list
that it would be useful, but never got around to doing it.

Here is one of the proposals:
- enable entitization character by character, via an XmlOption that basically says: "I want
  character x to always be entitized"
- turn CDATA on/off on a per-document basis
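
To give a feel for how the two knobs might interact, here is a rough, self-contained sketch. Everything in it is hypothetical: SaveSettings, entitize, setCdataEnabled and serialize are invented names and not XmlBeans API. It also makes two assumptions the proposal does not spell out: a character marked "always entitize" is written as a numeric character reference, and text containing such a character (or "]]>") is never put in a CDATA section, since nothing can be entitized inside CDATA.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical settings object; none of these names exist in XmlBeans.
final class SaveSettings {
    private final Set<Character> forced = new HashSet<>();
    private boolean cdataEnabled = true;

    /** "I want character c to always be entitized." */
    SaveSettings entitize(char c) { forced.add(c); return this; }

    /** Per-document switch: false means no CDATA section is ever written. */
    SaveSettings setCdataEnabled(boolean enabled) { cdataEnabled = enabled; return this; }

    String serialize(String text) {
        if (cdataEnabled && heuristicPrefersCdata(text)
                && !containsForced(text) && !text.contains("]]>")) {
            return "<![CDATA[" + text + "]]>";
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '<') out.append("&lt;");
            else if (c == '&') out.append("&amp;");
            else if (c == '>') out.append("&gt;"); // more than XML requires, but safe
            else if (forced.contains(c)) out.append("&#").append((int) c).append(';');
            else out.append(c);
        }
        return out.toString();
    }

    private boolean containsForced(String text) {
        for (int i = 0; i < text.length(); i++) {
            if (forced.contains(text.charAt(i))) return true;
        }
        return false;
    }

    // The rule quoted earlier in this message: >= 32 chars, >= 5 of '<'/'&', >= 1%.
    private static boolean heuristicPrefersCdata(String text) {
        if (text.length() < 32) return false;
        int special = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '<' || c == '&') special++;
        }
        return special >= 5 && (long) special * 100 >= text.length();
    }
}
```

With something like this, new SaveSettings().setCdataEnabled(false) would give a document that is always entitized, and new SaveSettings().entitize('"') would force double quotes to be written as character references.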

What do people think?
Thanks,
Radu

-----Original Message-----
From: Patrick Hochstenbach [mailto:Patrick.Hochstenbach@ugent.be]
Sent: Thursday, May 05, 2005 11:27 PM
To: user@xmlbeans.apache.org
Subject: CDATA Heuristics



Hi,

in our library we are very interested in using XMLBeans in a document
archiving project which stores XML files in a database. The excellent
round-tripping characteristics of XMLBeans are crucial to
our project. But when it comes to the serialization of text containing
escaped '<'-s and '&'-s, we're at a loss. XMLBeans seems to
have some heuristics to decide when text containing these
characters should be saved as CDATA and when not.

Is it possible to decide at runtime when text should be saved in
CDATA sections and when not? Or better, can CDATA sections be
preserved in some way?

Best regards,

Patrick


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@xmlbeans.apache.org
For additional commands, e-mail: user-help@xmlbeans.apache.org



