cocoon-dev mailing list archives

From "Simone Tripodi (JIRA)" <j...@apache.org>
Subject [jira] Updated: (COCOON3-5) Add an HTML2XHTML converter as Starter
Date Tue, 14 Oct 2008 13:56:44 GMT

     [ https://issues.apache.org/jira/browse/COCOON3-5?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simone Tripodi updated COCOON3-5:
---------------------------------

    Attachment: NekoGenerator.patch

The attached patch contains a simple implementation that uses CyberNeko (http://nekohtml.sourceforge.net/).
Like the other generators, it works with the SAX APIs: it starts from a SAXParser instance and sends
SAX events to the xmlConsumer.
A simple test case is also included.
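
The patch itself is not reproduced in this message, so the following is only a rough sketch of the pattern described above (a parser that pushes SAX events to a consumer), not the attached code. The class names SaxGeneratorSketch, Generator and RecordingConsumer are made up for illustration, and the JDK's built-in SAX parser stands in for CyberNeko so that the example runs without extra dependencies. As far as I know, NekoHTML's org.cyberneko.html.parsers.SAXParser can be passed in wherever an XMLReader is expected, which is what would let the same generator accept HTML that is not well-formed.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class SaxGeneratorSketch {

    /** Hypothetical starter: reads a document through an XMLReader and forwards its SAX events. */
    static final class Generator {
        private final XMLReader reader;
        private ContentHandler consumer;

        Generator(XMLReader reader) {
            this.reader = reader;
        }

        void setConsumer(ContentHandler consumer) {
            this.consumer = consumer;
        }

        void execute(InputSource source) throws Exception {
            // Every event produced by the parser goes straight to the consumer.
            reader.setContentHandler(consumer);
            reader.parse(source);
        }
    }

    /** Hypothetical consumer that only records element start/end events. */
    static final class RecordingConsumer extends DefaultHandler {
        final List<String> events = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            events.add("start:" + qName);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            events.add("end:" + qName);
        }
    }

    /** Parses the given markup and returns the recorded element events. */
    static List<String> run(String markup) throws Exception {
        // Assumption: the JDK parser is used here only as a stand-in;
        // it accepts well-formed input only, unlike an HTML-tolerant parser.
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        Generator generator = new Generator(reader);
        RecordingConsumer consumer = new RecordingConsumer();
        generator.setConsumer(consumer);
        generator.execute(new InputSource(new StringReader(markup)));
        return consumer.events;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run("<html><body><p>Hello</p></body></html>"));
    }
}
```

Keeping the parser behind the XMLReader interface means the generator does not need to know which parser produced the events; only the constructor argument would change when switching to an HTML-tolerant parser.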

> Add an HTML2XHTML converter as Starter
> --------------------------------------
>
>                 Key: COCOON3-5
>                 URL: https://issues.apache.org/jira/browse/COCOON3-5
>             Project: Cocoon 3
>          Issue Type: Improvement
>          Components: cocoon-optional
>    Affects Versions: 3.0.0-alpha-2
>            Reporter: Simone Tripodi
>            Assignee: Cocoon Developers Team
>            Priority: Minor
>             Fix For: 3.0.0-alpha-2
>
>         Attachments: NekoGenerator.patch
>
>
> This starter component for the pipeline takes HTML content from the specified URL and
> transforms it into XHTML or, at least, a well-formed XML document.
> The original document can then be processed in the pipeline in various ways:
>  * following links;
>  * implementing crawlers;
>  * easily transforming the original document into various other formats;
>  * etc.
> I want to explain the need for this component with a concrete case: last week I had to solve
> a particular problem, building a simple service that takes an HTML page's URL as input and
> transforms the page, through Optimus' XSLT (http://microformatique.com/optimus - http://code.google.com/p/mf-optimus/source/browse/#svn/trunk/xsl),
> into an XML document that contains the original document's Microformats in a simpler,
> easier-to-parse format.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

