forrest-dev mailing list archives

From Jeff Turner <>
Subject Re: Cocoon CLI - how to generate the whole site (Re: The Mythical Javadoc generator (Re: Conflict resolution))
Date Fri, 13 Dec 2002 17:06:18 GMT
On Fri, Dec 13, 2002 at 05:31:59PM +0100, Nicola Ken Barozzi wrote:
> Jeff Turner wrote:
> >The javadocs are _already_ generated, and <javadoc> has already put them
> >in build/site/apidocs/.  Now how is Cocoon (via the CLI) going to
> >"publish" them?
> Ok, now we finally get to the actual technical point. I will take this 
> discussion in a general way, because the issue is in fact quite general.
>                               -oOo-
> ATM, the Cocoon CLI system is completely crawler based. This means that
> it starts from a list of URLs, and "crawls" the site by getting the 
> links from these pages, putting them in the list, purging the visited 
> ones, and restarting the process with those.
> If we only have XML documents, the system can be made to be very fast 
> and semantically rich.
>   - fast
>    if we get the links while processing the file, we don't
>    have to reparse it later for the crawling
>   - semantically rich
>     we get the links not from the output, but from the real source.
>     In the sitemap, the source content, with all semantics, is
>     tagged and used for the link gathering. So we can even gather
>     links from an svg file that will become a jpeg image!
> Things start breaking down a bit when we have to use resources that are 
> not transformed to XML. Examples are CSS and massive docs to be included 
> like javadocs.
> The problem is not *reading* these files via Cocoon, but getting the 
> links from them. In the case of CSS we need the links; in the case of 
> Javadocs, we know the dir structure and possibly would not need them.
> For the CSS, the best thing is actually parsing them and passing them in 
> the SAX pipeline. I see no technical nor conceptual problem with it.
> The problem arises when we need to pass files in "bulk". In this case 
> they are javadocs, but what about jars, binaries, images, all things 
> that are not necessarily linked in the site, or that we simply want to 
> dump in the resulting system?
> This is the answer that I seek.

There is only one answer.

We've established that Cocoon is not going to be invoking Javadoc.  That
means that the user could generate the Javadocs _after_ they generate the
Cocoon docs.

To handle this possibility, the only course of action is to ignore links
to external directories like Javadocs.  What alternative is there?

One thing we could do is record all 'unprocessable' links in an external
file, and then the Ant script responsible for invoking Cocoon can look at
that, and ensure that the links won't break.  For example, say Cocoon
encounters an unprocessable '' link.  Cocoon records
that in unprocessed-files.txt, and otherwise ignores it.  Then, after the
<java> task has finished running Cocoon, an Ant task examines
unprocessed-files.txt, and if any java: links are recorded, it invokes a
Javadoc task.
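
A minimal sketch of the recording side, in Python rather than Cocoon's
own code. It assumes a crawler that keeps a queue of URLs, purges the
visited ones, and sets aside any link it cannot process. The names
`crawl`, `get_links` and `UNPROCESSABLE_PREFIXES`, and the sample link
`java:org.example.Foo`, are hypothetical and only illustrate the idea:

```python
from collections import deque

# Hypothetical: link prefixes the crawler cannot resolve by itself.
UNPROCESSABLE_PREFIXES = ("java:",)


def crawl(start_urls, get_links, unprocessed_path="unprocessed-files.txt"):
    """Crawl from start_urls and record unprocessable links in a file.

    get_links(url) is an assumed callback returning the links found
    while processing that page.
    """
    queue = deque(start_urls)
    visited = set()
    unprocessed = []

    while queue:
        url = queue.popleft()
        if url in visited:
            continue  # already handled, purge it from the list
        if url.startswith(UNPROCESSABLE_PREFIXES):
            # Record the link once, then otherwise ignore it.
            if url not in unprocessed:
                unprocessed.append(url)
            continue
        visited.add(url)
        queue.extend(get_links(url))

    # One link per line, for the build script to inspect afterwards.
    with open(unprocessed_path, "w") as f:
        for link in unprocessed:
            f.write(link + "\n")

    return visited, unprocessed
```

The crawler never fails on such a link; it only writes it down, so the
decision about what to do with it stays with whatever invoked the crawl.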

So we have a kind of loose coupling between Cocoon and other doc
generators.  Cocoon isn't _responsible_ for generating Javadocs, but it
can _cause_ Javadocs to be generated, by recording the fact that it
encountered a java: link and couldn't handle it.
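
The consuming side of this loose coupling could look roughly like the
following Python sketch. In practice this would be an Ant task, not
Python; the function name `generators_to_run` and the prefix-to-generator
mapping are hypothetical, and the file is assumed to hold one link per
line:

```python
import os


def generators_to_run(unprocessed_path, handlers):
    """Return the generators needed for the links recorded in the file.

    handlers maps a link prefix (e.g. "java:") to a generator name.
    A missing file is treated as "nothing was left unprocessed".
    """
    if not os.path.exists(unprocessed_path):
        return []

    with open(unprocessed_path) as f:
        links = [line.strip() for line in f if line.strip()]

    needed = []
    for prefix, generator in handlers.items():
        # One recorded link with this prefix is enough to trigger it.
        if any(link.startswith(prefix) for link in links):
            needed.append(generator)
    return needed
```

For example, `generators_to_run("unprocessed-files.txt", {"java:": "javadoc"})`
would name the Javadoc step only when at least one java: link was recorded,
so the crawler itself needs no knowledge of how Javadocs are produced.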

