shindig-issues mailing list archives

From "Vincent Siveton (JIRA)" <j...@apache.org>
Subject [jira] Updated: (SHINDIG-987) NekoParser returns cryptic error messages when parsing bad html
Date Sun, 05 Apr 2009 10:59:13 GMT

     [ https://issues.apache.org/jira/browse/SHINDIG-987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vincent Siveton updated SHINDIG-987:
------------------------------------

    Attachment: SHINDIG-987.patch

Simple patch

> NekoParser returns cryptic error messages when parsing bad html
> ---------------------------------------------------------------
>
>                 Key: SHINDIG-987
>                 URL: https://issues.apache.org/jira/browse/SHINDIG-987
>             Project: Shindig
>          Issue Type: Bug
>          Components: Java
>    Affects Versions: trunk
>            Reporter: Paul Lindner
>         Attachments: SHINDIG-987.patch
>
>
> startImportantElement can throw exceptions when parsing malformed html:
> Given this html:
>     <div id="div_super" class="div_super" valign:"middle"></div>
> You get an exception like this:
> org.w3c.dom.DOMException: INVALID_CHARACTER_ERR: An invalid or illegal XML character is specified.
> 	org.apache.xerces.dom.CoreDocumentImpl.createAttribute(Unknown Source)
> 	org.apache.xerces.dom.ElementImpl.setAttribute(Unknown Source)
> 	org.apache.shindig.gadgets.parse.nekohtml.NekoSimplifiedHtmlParser$DocumentHandler.startImportantElement(NekoSimplifiedHtmlParser.java:292)
> 	org.apache.shindig.gadgets.parse.nekohtml.NekoSimplifiedHtmlParser$DocumentHandler.startElement(NekoSimplifiedHtmlParser.java:242)
> 	org.apache.shindig.gadgets.parse.nekohtml.SocialMarkupHtmlParser$SocialMarkupDocumentHandler.startElement(SocialMarkupHtmlParser.java:130)
> Which is caused here:
>       for (int i = 0; i < xmlAttributes.getLength(); i++) {
>         if (xmlAttributes.getURI(i) != null) {
>           element.setAttributeNS(xmlAttributes.getURI(i), xmlAttributes.getQName(i),
>               xmlAttributes.getValue(i));
>         } else {
>           element.setAttribute(xmlAttributes.getLocalName(i) , xmlAttributes.getValue(i));
>         }
>       }
> because we're trying to set an attribute whose name has a colon in it.
> We should probably add some error checking here so that we can more easily identify the offending HTML without using a debugger.
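
One possible shape for the error checking suggested above, as a minimal sketch only. It is not the contents of SHINDIG-987.patch. It assumes the JDK's built-in DOM implementation and uses SAX `Attributes` as a stand-in for the `xmlAttributes` object in the quoted loop; the class and method names (`AttributeCopySketch`, `copyAttributes`, `newElement`) are hypothetical. The idea is to catch the `DOMException` and rethrow it with the offending attribute name and value in the message:

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical sketch; names below are illustrative, not Shindig's.
class AttributeCopySketch {

  // Creates a detached element using the JDK's default DOM implementation.
  static Element newElement(String tagName) {
    try {
      Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
      return doc.createElement(tagName);
    } catch (ParserConfigurationException e) {
      throw new IllegalStateException(e);
    }
  }

  // Copies attributes onto the element. If the DOM rejects one, rethrows
  // with the attribute name, value, and element name in the message.
  static void copyAttributes(Element element, Attributes attrs) {
    for (int i = 0; i < attrs.getLength(); i++) {
      String uri = attrs.getURI(i);
      String value = attrs.getValue(i);
      // SAX reports "no namespace" as an empty string; tolerate null too.
      boolean namespaced = uri != null && !uri.isEmpty();
      String name = namespaced ? attrs.getQName(i) : attrs.getLocalName(i);
      try {
        if (namespaced) {
          element.setAttributeNS(uri, name, value);
        } else {
          element.setAttribute(name, value);
        }
      } catch (DOMException e) {
        throw new IllegalArgumentException(
            "Cannot set attribute [" + name + "] with value [" + value + "] on <"
                + element.getTagName() + ">: " + e.getMessage(), e);
      }
    }
  }

  public static void main(String[] args) {
    AttributesImpl attrs = new AttributesImpl();
    attrs.addAttribute("", "id", "id", "CDATA", "div_super");
    // Assumed tokenization of the malformed markup: the whole run becomes the name.
    attrs.addAttribute("", "valign:\"middle\"", "valign:\"middle\"", "CDATA", "");
    try {
      copyAttributes(newElement("div"), attrs);
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Whether to rethrow, or to log and skip the bad attribute, is a design choice this sketch leaves open; rethrowing keeps the current failure behavior and only improves the message.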

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

