incubator-jspwiki-dev mailing list archives

From Ichiro Furusato <ichiro.furus...@gmail.com>
Subject Re: From Plugin to something completely different... that users need!
Date Thu, 15 Nov 2012 04:07:51 GMT
It occurs to me I didn't actually answer your principal question...

On Wed, Nov 14, 2012 at 11:26 PM, Christophe Dupriez <dupriez@destin.be> wrote:
[...]
> Excuse me to not share your enthusiasm for STAX: it is essential for big
> documents (I use it for that: RDF to XML transformations...) but WikiPages
> are not that long and templates are hard enough to keep them unconstrained.
> Anyway the main problem today is to DEFINE the process to translate
> (normalize) XHTML into WikiMarkup. XSLT is certainly a way to experiment
> (and share results). Let's start something like bringing together test
> cases?

I think if you were to look at an XSLT approach it's not quite
so bad. Whereas in XHtmlElementToWikiTranslator.java you see

    else if( n.equals( "h2" ) )
    {
        m_out.print( "\n!!! " );
        print( e );
        m_out.println();
    }

basically the pattern in XSLT would be something akin to:

    <xsl:template match="h2">
        <xsl:text>
!!! </xsl:text>
        <xsl:apply-templates/>
    </xsl:template>

where we match via an XPath pattern and output WikiMarkup. It would also
be a lot more reliable. The big question might seem to be whether the
input XHTML is truly well-formed XML. By definition XHTML *must* be, but
of course in the real world that might not be the case, and an XSLT
processor won't accept XML that isn't well-formed. But I'm assuming that,
given the input to XHtmlElementToWikiTranslator.java is a DOM Document,
we're already past that hurdle.
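
To make that concrete, here is a minimal sketch of running such a
stylesheet over an already-parsed tree using the JDK's built-in JAXP
classes. The class name XsltWikiSketch and its methods are hypothetical,
it assumes a W3C DOM Document (the tree type JSPWiki actually hands the
translator may differ), and it assumes the elements are in no namespace;
real XHTML in the XHTML namespace would need a prefixed match pattern.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical sketch, not JSPWiki code.
class XsltWikiSketch {

    // One-rule stylesheet: an h2 element becomes a "!!! " line.
    // method='text' makes the processor emit plain text, not XML.
    static final String STYLESHEET =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='h2'>"
        + "<xsl:text>&#10;!!! </xsl:text>"
        + "<xsl:apply-templates/>"
        + "<xsl:text>&#10;</xsl:text>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Parsing is the step that rejects input that is not well-formed
    // (the parser throws a SAXException).
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder()
                      .parse(new InputSource(new StringReader(xml)));
    }

    // Applies the stylesheet to an existing DOM tree and returns the text.
    static String toWiki(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(toWiki(parse("<div><h2>Title</h2></div>")));
    }
}
```

Since toWiki() only ever sees a Document, the well-formedness check has
already happened in whatever code built the tree, which is the point
made above.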

So what we'd need to do to define the transformation would be to
have a set of XPaths (particular markup patterns in XHTML) and what
each XPath would generate in WikiMarkup.  If we were to go to that
trouble the XSLT solution would be almost a byproduct of that work.
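
As a rough illustration of the "byproduct" idea, the sketch below keeps
the definition as a plain table of XPath match patterns and the
WikiMarkup each should emit, then generates a stylesheet from it. The
class WikiRuleTable and its methods are hypothetical; only the h2 rule
comes from the translator code quoted earlier, and the h3 and h4 rows
are illustrative guesses, not an agreed mapping.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not JSPWiki code.
class WikiRuleTable {

    // Rule table: XPath match pattern -> WikiMarkup prefix.
    // Only the h2 row is taken from the existing translator;
    // h3 and h4 are assumptions added for illustration.
    static Map<String, String> rules() {
        Map<String, String> rules = new LinkedHashMap<>();
        rules.put("h2", "!!! ");
        rules.put("h3", "!! ");
        rules.put("h4", "! ");
        return rules;
    }

    // Emits one template per rule. Patterns and prefixes are assumed to
    // contain no XML special characters (&, <, quotes), so nothing is
    // escaped here.
    static String toStylesheet(Map<String, String> rules) {
        StringBuilder sb = new StringBuilder();
        sb.append("<xsl:stylesheet version=\"1.0\"")
          .append(" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">\n")
          .append("  <xsl:output method=\"text\"/>\n");
        for (Map.Entry<String, String> rule : rules.entrySet()) {
            sb.append("  <xsl:template match=\"").append(rule.getKey()).append("\">")
              .append("<xsl:text>&#10;").append(rule.getValue()).append("</xsl:text>")
              .append("<xsl:apply-templates/>")
              .append("<xsl:text>&#10;</xsl:text>")
              .append("</xsl:template>\n");
        }
        return sb.append("</xsl:stylesheet>\n").toString();
    }

    public static void main(String[] args) {
        System.out.print(toStylesheet(rules()));
    }
}
```

The same table could just as easily drive the shared test cases, with
one XHTML input and one expected WikiMarkup output per row.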

Ichiro
