sling-dev mailing list archives

From "Jason E Bailey (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (SLING-6783) Updates for Commons HTML
Date Thu, 03 May 2018 16:18:00 GMT

    [ https://issues.apache.org/jira/browse/SLING-6783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16462390#comment-16462390 ]

Jason E Bailey edited comment on SLING-6783 at 5/3/18 4:17 PM:
---------------------------------------------------------------

[EDIT] I don't see any reason to support additional Parser properties at this time
unless someone asks for them. I'd rather focus on HTML compliance.


was (Author: jebailey):
We should either support them or at least document what is and isn't supported from a features
perspective. At this point I would just say documentation; I'm much more interested in finding
a way to make this HTML5 compliant than in features that no one has yet asked for.

> Updates for Commons HTML
> ------------------------
>
>                 Key: SLING-6783
>                 URL: https://issues.apache.org/jira/browse/SLING-6783
>             Project: Sling
>          Issue Type: Improvement
>          Components: Commons
>            Reporter: Jason E Bailey
>            Assignee: Oliver Lietz
>            Priority: Minor
>             Fix For: Commons HTML 1.0.2
>
>         Attachments: sling.patch
>
>
> Following updates:
> Updated the tagsoup lib to 1.2.1, which has the following modifications:
> * DOCTYPE is now recognized even in lower case.
> * We make sure to buffer the reader, eliminating a long-standing bug that would crash on certain inputs, such as & followed by CR+LF.
> * The HTML scanner's table is precompiled at run time for efficiency, causing a 4x speedup on large input documents.
> * ]] within a CDATA section no longer causes input to be discarded.
> * Remove bogus newline after printing children of the root element.
> * Allow the noscript element anywhere, the same as the script element.
> * Updated to the 2011 edition of the W3C character entity list.
> Additionally:
> Updated license with new home page for tagsoup
> Updated annotations to OSGi annotations
> Added the ability to specify additional features/properties for the parser
> Documented available settings
> Javadoc fixed
> Prepared for different parsers by renaming HtmlParserImpl and adding component properties
> Configuration improved



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
