abdera-user mailing list archives

From "Dan Beaulieu" <dbeaul...@moreover.com>
Subject Abdera chokes parsing content with type of html
Date Fri, 26 Mar 2010 15:28:57 GMT
Hi All, I am evaluating a few Java Atom parsers for a project. I am trying to parse a sample,
seen here -> http://pastebin.org/124779, that is pulled from the WordPress stream. As you can see,
the content tag has a type attribute with the value html, but the HTML isn't encoded. Abdera doesn't
like this. It fails with this error:

com.ctc.wstx.exc.WstxParsingException: Unexpected close tag </content>; expected </BR>.
 at [row,col {unknown-source}]: [24,528]

Is there any way to make Abdera lenient when it comes to invalid XML? While I appreciate standards,
I am in no position to change the WordPress stream.

For a simple test to replicate, here is all I am doing:

// create abdera and input stream from sample above.
Document<Entry> doc = abdera.parse(is);
Entry feed = doc.getRoot();
System.out.println(feed.getContent()); // <-- It fails here.
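
The same kind of failure can be shown without Abdera. What follows is only a minimal sketch under stated assumptions: it uses the JDK's built-in XML parser rather than Abdera or Woodstox, and the two entry strings are made-up stand-ins for the WordPress sample, not the actual pastebin content. The class and method names (UnescapedHtmlDemo, isWellFormed) are hypothetical. It suggests that an unclosed <BR> inside the content element is rejected by any conforming XML parser, while the escaped form parses.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class UnescapedHtmlDemo {
    // Hypothetical stand-in for the problematic entry: raw, unclosed <BR> in the content.
    static final String UNESCAPED =
        "<entry xmlns=\"http://www.w3.org/2005/Atom\">"
        + "<content type=\"html\">line one<BR>line two</content>"
        + "</entry>";

    // Same entry with the markup escaped as character data.
    static final String ESCAPED =
        "<entry xmlns=\"http://www.w3.org/2005/Atom\">"
        + "<content type=\"html\">line one&lt;BR&gt;line two</content>"
        + "</entry>";

    // Returns true if the JDK parser accepts the string as well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Raised for the mismatched <BR> ... </content> pair.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("unescaped well-formed: " + isWellFormed(UNESCAPED));
        System.out.println("escaped well-formed:   " + isWellFormed(ESCAPED));
    }
}
```

The first string is rejected and the second is accepted, which matches the WstxParsingException above: the parser sees an open BR element and then a close tag for content.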

 

