commons-user mailing list archives

From: Torgeir Veimo
Subject: Re: digester parsing with html content
Date: Wed, 27 Sep 2006 21:34:23 GMT

On 27 Sep 2006, at 22:23, Torgeir Veimo wrote:

> I'm trying to use Digester to parse XML that was previously
> parsed with JAXB 1.0-ea. Some of the content is XHTML fragments
> inside XML, e.g.
> <body-text><xhtml>...</xhtml></body-text>
> and I'd like to retrieve the content as a String bean property.
> However, I'd like the parser to treat the content of body-text as
> opaque. Currently it tries to parse it and chokes on entities such
> as &oslash;.
> Any clues on how I can configure Digester, or more precisely, the
> underlying parser, to avoid these problems?
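
To make the goal in the quoted question concrete, here is a minimal sketch that uses only the JDK's JAXP classes, not Digester. It parses the document into a DOM and serializes the children of body-text back into a String. The class and method names (BodyTextExtractor, extractBodyText) are hypothetical, and the sketch assumes the fragment is well-formed and contains no undeclared entities.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: returns the markup inside <body-text> as a String.
public class BodyTextExtractor {

    public static String extractBodyText(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        // Assumption: only the first body-text element is of interest.
        Node body = doc.getElementsByTagName("body-text").item(0);
        if (body == null) {
            return null;
        }

        // An identity transformer serializes DOM nodes back to text.
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringWriter out = new StringWriter();
        for (Node child = body.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            serializer.transform(new DOMSource(child), new StreamResult(out));
        }
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<article><title>t</title>"
                + "<body-text><xhtml><p>Hello</p></xhtml></body-text></article>";
        System.out.println(extractBodyText(xml));
    }
}
```

The parser still reads the fragment as XML here. The content is only turned back into a String afterwards, so it is not opaque to the parser in the strict sense.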

FYI, with JAXB I was previously using this DTD:

<!ELEMENT article (title, lead-text?, body-text, ...)>
     <!ATTLIST article ... >

<!ELEMENT title (#PCDATA)>

<!ELEMENT lead-text (#PCDATA)>

<!ELEMENT body-text (#PCDATA)>
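
One likely reason a parser rejects &oslash; is that the entity is not declared anywhere the parser can see. The sketch below, again using only JDK JAXP classes, shows the difference an internal DTD subset with an entity declaration makes. The class name, method name, and sample input are hypothetical, and the single oslash declaration is an assumption standing in for whatever entity set the real documents need.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo: parse body-text with and without an entity declaration.
public class EntityDeclarationDemo {

    // Internal DTD subset declaring the one entity used in the sample input.
    private static final String DOCTYPE =
            "<!DOCTYPE article [\n"
          + "  <!ENTITY oslash \"&#248;\">\n"
          + "]>\n";

    public static String parseBodyText(String articleXml, boolean declareEntity)
            throws Exception {
        String input = declareEntity ? DOCTYPE + articleXml : articleXml;
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // Throws a SAXException if the input references an undeclared entity.
        Document doc = builder.parse(new InputSource(new StringReader(input)));
        return doc.getElementsByTagName("body-text").item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<article><body-text>sm&oslash;r</body-text></article>";
        // With the declaration, the entity is expanded to its character.
        System.out.println(parseBodyText(xml, true));
        try {
            parseBodyText(xml, false);
        } catch (org.xml.sax.SAXException e) {
            // Without it, the document is not well-formed.
            System.out.println("Parse failed: " + e.getMessage());
        }
    }
}
```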

Torgeir Veimo
