commons-user mailing list archives

From Torgeir Veimo <torg...@pobox.com>
Subject Re: digester parsing with html content
Date Wed, 27 Sep 2006 21:34:23 GMT

On 27 Sep 2006, at 22:23, Torgeir Veimo wrote:

> I'm trying to use Digester to parse XML that was previously
> parsed with JAXB 1.0-ea. Some of the content is XHTML fragments
> inside the XML, e.g.
>
> <body-text><xhtml>...</xhtml></body-text>
>
> and I'd like to retrieve the content as a String bean property.
> However, I'd like the parser to treat the content of body-text as
> opaque. At the moment it tries to parse it and chokes on, e.g.,
> &oslash; entities.
>
> Any clues on how I can configure Digester or, more precisely, the
> underlying parser to avoid these problems?

FYI, with JAXB I was previously using this DTD:


<!ELEMENT article (title, lead-text?, body-text, ...)>
     <!ATTLIST article ... >

<!ELEMENT title (#PCDATA)>

<!ELEMENT lead-text (#PCDATA)>

<!ELEMENT body-text (#PCDATA)>
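
For what it's worth, the &oslash; failure can be reproduced without Digester, since an XML parser rejects any named entity that no DTD declares. Below is a minimal sketch using only the JDK's DOM parser, not Digester's own API. The class name EntityDemo, the sample document and the one-entity internal DOCTYPE are made up for illustration; only the element names come from the DTD above. It suggests that once the entity is declared, body-text comes back as plain text.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

public class EntityDemo {
    // Hypothetical sample document; element names follow the DTD above.
    static final String BODY =
        "<article><title>t</title><body-text>sm&oslash;r</body-text></article>";

    // Assumed internal DTD subset declaring the single entity the sample uses (U+00F8).
    static final String DOCTYPE = "<!DOCTYPE article [ <!ENTITY oslash \"&#248;\"> ]>";

    // Parses the XML with a non-validating DOM parser and returns the text of body-text.
    static String bodyText(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getElementsByTagName("body-text").item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        try {
            // No DTD at all: the reference to oslash is a well-formedness error.
            bodyText(BODY);
        } catch (SAXParseException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        // Same document with the entity declared: the reference is expanded.
        System.out.println(bodyText(DOCTYPE + BODY));
    }
}
```

In a real setup the declarations would more likely come from a DTD that pulls in the XHTML entity sets than from a hand-written internal subset like this one.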

-- 
Torgeir Veimo
torgeir@pobox.com





