forrest-dev mailing list archives

From Andrew Savory <and...@luminas.co.uk>
Subject Re: file: implemented (Re: cvs commit: ...)
Date Fri, 13 Dec 2002 09:53:07 GMT

On Fri, 13 Dec 2002, Jeff Turner wrote:

> Because in the long run, I would prefer to develop a separate wget-like
> tool with cocoon-view hacks added to it, than to develop the CLI into a
> full-blown threaded crawler.  Why?  Because a separate tool has a _much_
> larger audience, so will evolve faster.  Yes, a Cocoon CLI may be more
> elegant, but a separate tool can grow geometrically while the CLI grows
> linearly.

I can see some serious advantages to splitting the crawler from the CLI:
when the crawler is there, it would be fantastic to add a "precacher"
using the crawler (go hit my entire site, including internal cocoon-views)
rather than the "traditional" approach of running wget on a site. I
suspect various other things that rely on crawling (such as search
implementations like the Lucene code) would benefit from the speed
increase of a dedicated crawler, too.

I think it would be best done as part of Cocoon rather than Forrest though
(or am I missing the point *again*? ;-), as there are more ways it would
be used there.


Andrew.

-- 
Andrew Savory                                Email: andrew@luminas.co.uk
Managing Director                              Tel:  +44 (0)870 741 6658
Luminas Internet Applications                  Fax:  +44 (0)700 598 1135
This is not an official statement or order.    Web:    www.luminas.co.uk

