any23-dev mailing list archives

From HansBrende <...@git.apache.org>
Subject [GitHub] any23 pull request #59: ANY23-326 fixed rdfa issue with unclosed input & met...
Date Wed, 24 Jan 2018 12:48:33 GMT
GitHub user HansBrende opened a pull request:

    https://github.com/apache/any23/pull/59

    ANY23-326 fixed rdfa issue with unclosed input & meta tags

    This PR should also fix ANY23-317, ANY23-273, ANY23-267, ANY23-271, and ANY23-227
    (this time, for real).
    
    These all have to do with the RDFa implementation failing to parse HTML.
    
    My previous commit attempted to fix these issues by changing the default parser from
    NekoHTML to Jsoup. It turns out, however, that the RDFa implementation uses a completely
    different HTML parser under the hood, and it is that parser, not ours, that is too strict.
    Changing ours from NekoHTML to Jsoup therefore had no effect on these issues (although it
    did come with a nice 20% speed increase, so there's that). It seems that, for Rio parsers,
    the document is parsed with Jsoup *only to get the document language* and is then parsed
    **again** under the hood by some other parser.
    
    Now, I simply check the RDF format to see if we're putting out XHTML. If we are, I first
    XHTML-ify the stream with Jsoup before sending it on to the Rio RDF parser.
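
To illustrate why XHTML-ifying the input helps, here is a minimal sketch that is not taken from the PR. It assumes the parser behind the RDFa implementation behaves like a strict XML parser, and it uses the JDK's own `DocumentBuilder` as a stand-in for it. The class name `StrictParseDemo` and the helper `isWellFormed` are hypothetical names chosen for this example. The Jsoup rewrite step itself is not shown.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; the JDK XML parser stands in for the strict parser (assumption).
final class StrictParseDemo {

    /** Returns true if the markup parses as well-formed XML, false if the parser rejects it. */
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            // Raised for markup that is not well-formed, e.g. an unclosed <meta> or <input>.
            return false;
        } catch (Exception e) {
            // Parser configuration or I/O problems are not expected for in-memory input.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Plain HTML: <meta> and <input> are void elements and are commonly left unclosed.
        String html = "<html><head><meta charset=\"utf-8\"></head>"
                + "<body><input type=\"text\"></body></html>";
        // The same document with the void elements self-closed, as an XHTML serializer would emit them.
        String xhtml = "<html><head><meta charset=\"utf-8\" /></head>"
                + "<body><input type=\"text\" /></body></html>";

        System.out.println(isWellFormed(html));  // prints "false"
        System.out.println(isWellFormed(xhtml)); // prints "true"
    }
}
```

Under that assumption, the unclosed `<input>` and `<meta>` tags named in the issue title are exactly what a strict parser trips over, and rewriting the stream as XHTML first removes them before the strict parser ever sees the document.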
    
    mvn clean install -> all tests passed.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/HansBrende/any23 ANY23-326

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/any23/pull/59.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #59
    
----
commit 74b2909b6d91cc4989093d90a38baef1c34c603f
Author: Hans <firedrake93@...>
Date:   2018-01-24T12:26:40Z

    ANY23-326 fixed rdfa issue with unclosed input & meta tags

----

