forrest-dev mailing list archives

From Jeff Turner <je...@apache.org>
Subject Re: Generating html from xml without putting a file in site.xml
Date Thu, 25 Sep 2003 22:55:15 GMT
On Thu, Sep 25, 2003 at 09:11:52PM +0200, Eric BURGHARD wrote:
> On Thursday 25 September 2003 17:26, Jan.Materne@rzf.fin-nrw.de wrote:
> > Like Upayavira says: the cli.xconf would be the right place.
> > Another possibility is to add a hidden link.  <a href="..."/>
> > Nobody can click on that, but the crawler finds it.
> >
> >
> > Jan
> >
> 
> Are there any differences between the result of the forrest site
> generation (with the ant script) and the one from a site ripper tool
> like wget (which follows the links too)?

In theory, yes.  Because the Forrest crawler has access to the content's
XML before it gets rendered, we can spider documents like PDFs or .swf's
for links.

But most of the time people render HTML where links are obvious, and wget
is just as good.
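
To illustrate the "links are obvious" case: a minimal sketch of how a
link-following tool can pull targets out of rendered HTML. This is not
Forrest's or wget's actual code; the names LinkCollector and
extract_links and the example.org URL are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect href targets from <a> tags (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Also reached for self-closing tags such as <a href="x"/>,
        # because HTMLParser routes those through handle_starttag.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return absolute URLs for every <a href> found in html."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


if __name__ == "__main__":
    page = '<p><a href="a.html">A</a><a href="hidden.html"/></p>'
    for link in extract_links(page, "http://example.org/docs/"):
        print(link)
```

Note that the empty "hidden" anchor is picked up like any other link,
which is why that trick works for a crawler even though nobody can
click on it.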

--Jeff

> 
> A+
> 
