maven-doxia-dev mailing list archives

From Jason van Zyl
Subject Re: Doxia Parsing API?
Date Mon, 31 Dec 2007 22:16:29 GMT

On 31 Dec 07, at 2:45 AM, Vincent Massol wrote:

> On Dec 27, 2007, at 11:20 AM, Vincent Massol wrote:
>> Hi Juan,
>> Thanks for your email and sorry for my late answer, I've just seen  
>> the mails now.
>> I've started using the confluence parser as a starting point for  
>> writing the XWiki parser. Re the speed, the confluence parser also  
>> generates a Block Tree but I'm not sure how this affects  
>> performance negatively.
> I can answer that... It'll matter for large documents since users  
> would not start to see anything output before the end of the  
> parsing. Modifying the parser to call traverse() whenever a block is  
> created would be very easy to do though. I think I might add a flag  
> for the xwiki parser to decide what to do.

Yah, modifying the parser to be more efficient would not be a problem.  
I was never dealing with anything more than a couple k.
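
The trade-off in this exchange, traversing each block as soon as it is created versus building the whole block tree first, can be sketched in a few lines of Java. This is only an illustration under stated assumptions: `EventSink`, `Block`, `ParagraphBlock`, `SketchParser`, `StreamingDemo` and the `streaming` flag are hypothetical stand-ins, not the actual Doxia `Sink` or Confluence parser classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-ins for illustration; not the real Doxia types.
interface EventSink {
    void text(String s);
}

interface Block {
    void traverse(EventSink sink);
}

final class ParagraphBlock implements Block {
    private final String content;

    ParagraphBlock(String content) {
        this.content = content;
    }

    @Override
    public void traverse(EventSink sink) {
        sink.text(content);
    }
}

final class SketchParser {
    private final boolean streaming; // assumed flag: emit per block or buffer the tree

    SketchParser(boolean streaming) {
        this.streaming = streaming;
    }

    void parse(Iterator<String> paragraphs, EventSink sink) {
        List<Block> tree = new ArrayList<>();
        while (paragraphs.hasNext()) {
            Block block = new ParagraphBlock(paragraphs.next());
            if (streaming) {
                block.traverse(sink); // output starts before parsing ends
            } else {
                tree.add(block);      // nothing is emitted until the input is exhausted
            }
        }
        for (Block block : tree) {
            block.traverse(sink);
        }
    }
}

final class StreamingDemo {
    // Records reads and emitted events in one log to show their interleaving.
    static List<String> run(boolean streaming) {
        final List<String> log = new ArrayList<>();
        final Iterator<String> source = Arrays.asList("a", "b").iterator();
        Iterator<String> loggingSource = new Iterator<String>() {
            @Override
            public boolean hasNext() {
                return source.hasNext();
            }

            @Override
            public String next() {
                String s = source.next();
                log.add("read:" + s);
                return s;
            }
        };
        new SketchParser(streaming).parse(loggingSource, s -> log.add("emit:" + s));
        return log;
    }
}
```

`StreamingDemo.run(true)` yields `[read:a, emit:a, read:b, emit:b]`, while `StreamingDemo.run(false)` yields `[read:a, read:b, emit:a, emit:b]`: in the buffered case the first event reaches the sink only after the last block has been read, which is the delay that matters for large documents.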

> However there are some cases where the full parsing is required. For  
> example the XWiki TOC macro requires the full parsing to be done  
> since it needs to know all the section headers. Of course a second  
> level parsing could also be done, looking only for headers but that  
> would affect the performance a bit. So for all macros that require  
> the document structure for rendering we need the full parsing to be  
> done first. However it's hard to know quickly if the document  
> contains macros that work on the document structure and thus  
> we might have to parse the whole doc anyway...
> Thanks
> -Vincent
>> FWIW I've run some quick tests between the JavaCC-generated parser  
>> for XWiki that is in the wikimodel parser vs the "hand-written"  
>> Confluence parser in Doxia (since confluence and xwiki are of  
>> similar complexity for their syntaxes) and the result I got so far  
>> is that the "hand-written" parser is faster so I've gone ahead and  
>> used the "hand-written" confluence parser as a starting point.
>> Thanks again
>> -Vincent
>> On Dec 19, 2007, at 5:01 PM, Juan F. Codagnone wrote:
>>> Hi Vincent,
>>> On Wednesday 19 December 2007, Vincent Massol wrote:
>>> ...
>>>> I'd like to implement a Doxia parser for XWiki. However I've noticed
>>>> there's no standard in Doxia yet for parsing. Actually looking at
>>>> Doxia confluence, twiki and Apt I see each does it with its own code.
>>>> However the Confluence and TWiki implementations are very similar,
>>>> each defining Block, BlockParser, etc.
>>> ...
>>>> content). Does anyone have any idea how the Confluence parser  
>>>> compares
>>>> for example with, say, a JavaCC-generated parser?
>>> The confluence parser was made after the twiki parser by Jason.
>>> When I first wrote the twiki parser I felt that it was easier to
>>> make an ad hoc parser instead of a generated one for a language
>>> that has many exceptions.
>>> (I was also reading a TDD book at that time, and I wanted to get
>>> some practice, and the ad hoc parser was perfect.)
>>> Here is the original post
>>> Two years later I think it was a good decision. One developer who
>>> had never seen the original code was comfortable adding new
>>> language features and bugfixes.
>>> In terms of a fast rendering mechanism, the twiki parser has a
>>> drawback: it first builds a block tree (like a DOM tree), and then
>>> the blocks generate the events for the Sink.
>>> Juan.
>>> -- 
>>> Buenos Aires, Argentina                            22°C with winds  
>>> at 9 km/h E
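
Vincent's TOC example in the quoted thread shows why some macros force a full parse: a table of contents placed at the top of a page needs headers that only appear later. A rough sketch, assuming a hypothetical flat block model (`Kind`, `TocNode` and `TocSketch` are invented for illustration and are not XWiki or Doxia types):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical block model for illustration only.
enum Kind { HEADER, TEXT, TOC_MACRO }

final class TocNode {
    final Kind kind;
    final int level;   // header level; 0 for non-headers
    final String text;

    private TocNode(Kind kind, int level, String text) {
        this.kind = kind;
        this.level = level;
        this.text = text;
    }

    static TocNode ofHeader(int level, String title) {
        return new TocNode(Kind.HEADER, level, title);
    }

    static TocNode ofText(String s) {
        return new TocNode(Kind.TEXT, 0, s);
    }

    static TocNode ofToc() {
        return new TocNode(Kind.TOC_MACRO, 0, "");
    }
}

final class TocSketch {
    static List<String> render(List<TocNode> blocks) {
        // Pass 1: collect every header. This needs the complete block list,
        // so it cannot run while blocks are still being parsed.
        List<String> toc = new ArrayList<>();
        for (TocNode n : blocks) {
            if (n.kind == Kind.HEADER) {
                StringBuilder line = new StringBuilder();
                for (int i = 1; i < n.level; i++) {
                    line.append("  "); // two spaces of indent per nesting level
                }
                toc.add(line.append(n.text).toString());
            }
        }
        // Pass 2: emit output, expanding the macro with the collected headers.
        List<String> out = new ArrayList<>();
        for (TocNode n : blocks) {
            switch (n.kind) {
                case TOC_MACRO:
                    out.addAll(toc);
                    break;
                case HEADER:
                    out.add("h" + n.level + ": " + n.text);
                    break;
                default:
                    out.add(n.text);
                    break;
            }
        }
        return out;
    }
}
```

Because the macro sits before the headers it lists, the first pass has to finish before anything can be emitted. A streaming parser would have to detect such macros up front or fall back to full parsing, which is the difficulty Vincent describes.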



Jason van Zyl
Founder,  Apache Maven
jason at sonatype dot com

believe nothing, no matter where you read it,
or who has said it,
not even if i have said it,
unless it agrees with your own reason
and your own common sense.

-- Buddha 
