maven-doxia-dev mailing list archives

From Vincent Massol <>
Subject Re: The need for multiple passes over a document
Date Tue, 29 Apr 2008 20:05:37 GMT

On Apr 29, 2008, at 1:56 AM, Vincent Siveton wrote:

> 2008/4/28 Jason van Zyl <>:
>> I was looking at the TOC macro and I feel what it's doing is wrong
>> insofar as it requires a second pass to get the structure of the
>> document.
> Agreed, but we did it as best we could :)
>> There are definitely cases where you need to make multiple passes,
>> and the TOC macro is clearly one of them. Having to pass in the whole
>> source document and the parser to make the TOC macro work seems
>> extreme to me.
>> I think that we should declaratively say, or determine, that the
>> structure of the document is required by something in the page, then
>> preprocess the page in a general way rather than requiring the whole
>> document and parser to be passed in again, as that's pretty
>> cumbersome for the implementor of a parser.
>> I also noticed that the parsers are not threadsafe. I don't believe
>> this was always the case, and we should make them threadsafe again if
>> it's true they aren't. I just looked at the APT parser and it doesn't
>> look threadsafe to me, but it wouldn't take much to make it
>> threadsafe.
> DefaultDoxia has a comment about thread safety...
>> I would like to take a pass at making the document structure
>> requirement more general to avoid things like we're doing in the TOC
>> macro. I would also like to take a pass at making the parsers
>> threadsafe.
>> I think we should also just release 1.0 for the sake of the site
>> plugin and then move on with the next version of Doxia. We need to
>> remove the coupling of Doxia to the site plugin and move the core
>> back to a simple set of parsers and sinks.
> Sounds like a Doxia 2.0 :) I think Doxia has several limitations,
> especially for style. DOXIA-204 solved several of them, but I think we
> could do more.

BTW, if you're interested in following what I'm doing in XWiki land,
it's available here:

Architecture/spec is here:

Basically I have the following main objects:
* Listener (Sink in Doxia speak)
* Parser
* Macro
* Transformation
* Document AST

The process is:
1) The text is transformed into an AST by the Parser.
2) The Transformation manager finds the list of Transformation
components to execute on the AST.
3) One such transformation is called MacroTransformation. It is in
charge of looking for all MacroBlock blocks in the AST and executing
them until there are no more MacroBlocks (this allows nested Macros).
Thus a Macro takes an AST as a parameter and generates a list of Blocks.
4) The modified AST is then traversed (traverse()) with a Listener.
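
To make the four steps concrete, here is a minimal Java sketch of that
flow. It is not the XWiki code: every class name, the toy "{name}"
macro syntax, and the "outer"/"inner" macros are assumptions made up
for illustration. It only shows a parser producing a block tree, a
macro pass that repeats until no MacroBlock is left, and a traversal
that feeds a Listener.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// All names here are hypothetical; they only mirror the objects listed above.
abstract class Block {
    final List<Block> children = new ArrayList<>();
}

class RootBlock extends Block {}

class WordBlock extends Block {
    final String word;
    WordBlock(String word) { this.word = word; }
}

class MacroBlock extends Block {
    final String name;
    MacroBlock(String name) { this.name = name; }
}

interface Macro {
    // Receives the macro call and the whole AST, returns the replacement blocks.
    List<Block> execute(MacroBlock call, Block root);
}

interface Listener {
    void onWord(String word);
}

class MacroTransformation {
    private final Map<String, Macro> macros;
    MacroTransformation(Map<String, Macro> macros) { this.macros = macros; }

    // Step 3: expand one MacroBlock at a time until none is left,
    // so a macro may itself emit further MacroBlocks (nested macros).
    void transform(Block root) {
        while (replaceFirstMacro(root, root)) {
            // keep expanding
        }
    }

    private boolean replaceFirstMacro(Block parent, Block root) {
        for (int i = 0; i < parent.children.size(); i++) {
            Block child = parent.children.get(i);
            if (child instanceof MacroBlock) {
                MacroBlock call = (MacroBlock) child;
                Macro macro = macros.get(call.name);
                if (macro == null) {
                    throw new IllegalStateException("Unknown macro: " + call.name);
                }
                List<Block> replacement = macro.execute(call, root);
                parent.children.remove(i);
                parent.children.addAll(i, replacement);
                return true;
            }
            if (replaceFirstMacro(child, root)) {
                return true;
            }
        }
        return false;
    }
}

class PipelineDemo {
    // Step 1: a toy parser; whitespace-separated tokens, "{name}" is a macro call.
    static Block parse(String text) {
        Block root = new RootBlock();
        for (String token : text.trim().split("\\s+")) {
            if (token.length() > 2 && token.startsWith("{") && token.endsWith("}")) {
                root.children.add(new MacroBlock(token.substring(1, token.length() - 1)));
            } else {
                root.children.add(new WordBlock(token));
            }
        }
        return root;
    }

    // Step 4: depth-first traversal that sends events to a Listener.
    static void traverse(Block block, Listener listener) {
        if (block instanceof WordBlock) {
            listener.onWord(((WordBlock) block).word);
        }
        for (Block child : block.children) {
            traverse(child, listener);
        }
    }

    static List<String> run(String text) {
        Map<String, Macro> macros = new HashMap<>();
        // "outer" emits another macro call, which the next iteration expands.
        macros.put("outer", (call, root) ->
                Arrays.<Block>asList(new WordBlock("hello"), new MacroBlock("inner")));
        macros.put("inner", (call, root) ->
                Arrays.<Block>asList(new WordBlock("world")));

        Block root = parse(text);                        // step 1
        new MacroTransformation(macros).transform(root); // steps 2 and 3
        List<String> words = new ArrayList<>();
        traverse(root, words::add);                      // step 4
        return words;
    }

    public static void main(String[] args) {
        System.out.println(run("say {outer} !")); // prints [say, hello, world, !]
    }
}
```

Note that this sketch loops forever if a macro keeps emitting itself; a
real implementation would presumably need a guard against that.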

Note1: XWiki can use both Doxia and WikiModel transparently since it
has a bridge to both. Right now the bridge I have is a Parser bridge
where Doxia or WikiModel parsers generate an XWiki Document AST. In
this manner I'm reusing Doxia's and WikiModel's parsers. My next step
is to have a Sink bridge so that I can use Doxia sinks.

Note2: The events I have are finer grained since I have events at the  
Word level:
     void onWord(String word);
     void onSpace();
     void onSpecialSymbol(SpecialSymbol symbol);
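
As an illustration of what word-level events allow, here is a small
Java sketch of a listener that rebuilds plain text and counts words.
Only the three method signatures come from the message above; the
WordListener and PlainTextListener names and the contents of the
SpecialSymbol enum are assumptions.

```java
// Hypothetical symbol type; the message does not show how SpecialSymbol is defined.
enum SpecialSymbol {
    COMMA(','), EXCLAMATION('!');

    final char character;
    SpecialSymbol(char character) { this.character = character; }
}

// The three events quoted above, gathered in a hypothetical interface.
interface WordListener {
    void onWord(String word);
    void onSpace();
    void onSpecialSymbol(SpecialSymbol symbol);
}

// Rebuilds the text from the events and counts the words seen.
class PlainTextListener implements WordListener {
    private final StringBuilder buffer = new StringBuilder();
    private int wordCount;

    @Override
    public void onWord(String word) {
        buffer.append(word);
        wordCount++;
    }

    @Override
    public void onSpace() {
        buffer.append(' ');
    }

    @Override
    public void onSpecialSymbol(SpecialSymbol symbol) {
        buffer.append(symbol.character);
    }

    String text() { return buffer.toString(); }
    int wordCount() { return wordCount; }
}
```

Because words, spaces and symbols arrive as separate events, a listener
like this can count or rewrite words without re-tokenizing the text.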

Note3: Since I want to support generation of HTML elements from Macros  
I have an HTMLBlock element and the following Listener events:
     void beginXMLElement(String name, Map<String, String> attributes);
     void endXMLElement(String name, Map<String, String> attributes);
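
A listener receiving these two events could serialize them back to
markup roughly as follows. This is only a sketch: the XmlListener and
MarkupListener names, the extra onWord event on the same interface, and
the escaping rules are assumptions, not the XWiki implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical interface holding the two events above plus a word event.
interface XmlListener {
    void beginXMLElement(String name, Map<String, String> attributes);
    void endXMLElement(String name, Map<String, String> attributes);
    void onWord(String word);
}

// Writes the events out as markup, escaping text and attribute values.
class MarkupListener implements XmlListener {
    private final StringBuilder out = new StringBuilder();

    private static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    @Override
    public void beginXMLElement(String name, Map<String, String> attributes) {
        out.append('<').append(name);
        for (Map.Entry<String, String> attribute : attributes.entrySet()) {
            out.append(' ').append(attribute.getKey())
               .append("=\"").append(escape(attribute.getValue())).append('"');
        }
        out.append('>');
    }

    @Override
    public void endXMLElement(String name, Map<String, String> attributes) {
        // The attributes are passed again on close; this sketch does not need them.
        out.append("</").append(name).append('>');
    }

    @Override
    public void onWord(String word) {
        out.append(escape(word));
    }

    String result() { return out.toString(); }

    public static void main(String[] args) {
        Map<String, String> attributes = new LinkedHashMap<>();
        attributes.put("href", "#top");
        MarkupListener listener = new MarkupListener();
        listener.beginXMLElement("a", attributes);
        listener.onWord("top");
        listener.endXMLElement("a", attributes);
        System.out.println(listener.result()); // prints <a href="#top">top</a>
    }
}
```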

