forrest-dev mailing list archives

From Nicola Ken Barozzi <>
Subject Re: cocoon crawler, wget, the problem of extracting links
Date Fri, 13 Dec 2002 16:05:45 GMT

Steven Noels wrote:
> Bruno Dumon wrote:
>> Another solution would be to make a list of URLs for all these files
>> and feed that to the crawler. The thing that makes this list would of
>> course need to make some assumptions about how files on the
>> filesystem are mapped into the URL space.
> Or vice-versa.
> I'm still stuck with this idea of having a LinkResolverTransformer 
> which, given a configuration of schemes and their respective source 
> resolution, would rewrite links as needed. It might be "boneheaded 
> me", and orthogonal/supplementary to the sitemap and what is currently 
> put forward, but I want to do my thinking in public.


> Does this make sense at all?

Yes, it does.

It's exactly the same concept as in my "Concern 1" section about link 
lookup and resolution. I modeled it as an action, but forgot to include 
the transformation of links that you describe here.
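To make the idea concrete, here is a minimal sketch of such a scheme-based link rewriter. This is not Cocoon API; the class and method names (SchemeLinkRewriter, addScheme, rewrite) are illustrative assumptions, and a real implementation would be a SAX transformer rewriting link attributes as events stream by. The core logic is just: look up the link's scheme in a configured map and substitute the resolved prefix.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of scheme-based link rewriting: a configuration of
// schemes and their respective source resolution, applied per link.
public class SchemeLinkRewriter {
    private final Map<String, String> schemePrefixes = new LinkedHashMap<>();

    // Register a scheme (e.g. "site") and the prefix its links resolve to.
    public void addScheme(String scheme, String resolvedPrefix) {
        schemePrefixes.put(scheme, resolvedPrefix);
    }

    // Rewrite one link: if its scheme is configured, replace "scheme:"
    // with the configured prefix; otherwise pass the link through untouched.
    public String rewrite(String link) {
        int colon = link.indexOf(':');
        if (colon < 0) {
            return link;                      // relative link, leave as-is
        }
        String prefix = schemePrefixes.get(link.substring(0, colon));
        if (prefix == null) {
            return link;                      // unknown scheme (http:, mailto:)
        }
        return prefix + link.substring(colon + 1);
    }

    public static void main(String[] args) {
        SchemeLinkRewriter rewriter = new SchemeLinkRewriter();
        rewriter.addScheme("site", "/docs/");
        System.out.println(rewriter.rewrite("site:primer.html"));  // /docs/primer.html
        System.out.println(rewriter.rewrite("http://apache.org/")); // unchanged
    }
}
```

Unrecognized schemes and relative links pass through unchanged, which is what lets such a transformer sit orthogonally to the sitemap: it only touches links whose schemes it was configured to resolve.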

+1 (about the concept, we will see what makes more sense).

Nicola Ken Barozzi         
             - verba volant, scripta manent -
    (discussions get forgotten, just code remains)
