forrest-dev mailing list archives

From Vadim Gritsenko <vadim.gritse...@verizon.net>
Subject Re: Link Crawling?
Date Mon, 04 Nov 2002 16:14:17 GMT
Nicola Ken Barozzi wrote:

>
> Vadim Gritsenko wrote:
>
>> Nicola Ken Barozzi wrote:
>>
>>>
>>> Peter Donald wrote:
>>> > Hi,
>>> >
>>> > Is there any way I can add more strategies for link crawling during
>>> > CLI operation? In particular I have a CSS stylesheet that has
>>> >
>>> > @import url("blah.css");
>>> >
>>> > but this won't ever be copied across because it is not crawled.
>>> >
>>> > Suggestions?
>>>
>>> Basically, the whole Cocoon CLI system was hacked together by Stefano
>>> and Gianugo, and has not been touched much since then.
>>>
>>> It has long been neglected, and as you know too well from using it on
>>> the Avalon site, it stopped at every single problem it had with links,
>>> which BTW was never the intention of the original writers.
>>>
>>> Lately I have tweaked it to output better info to the user and not to
>>> break on broken links.
>>> It still needs more work though.
>>>
>>> For now you have two options: include that link in the HTML as an
>>> attribute of a tag (try <!-- <a href="blah.css"/> --> ) or patch the
>>> Cocoon CLI, which is Main.java and many other classes.
>>
>>
>> Actually, the whole link extraction logic is in LinkSerializer and its
>> parent classes.
>
>
> Actually there is some link processing in Main.java, look at
>
>   public Collection processURI(String uri) throws Exception {...
>
> to see what I mean. 


Sorry, Ken, I don't see what you see: this method just collects and
translates the URIs returned by LinkSamplingEnv (== LinkSerializer) and
tries to come up with a file name for each of them...
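
To make "collects and translates" concrete, here is a minimal sketch of that kind of step. It is not the actual Main.java code: the class name LinkTranslationSketch, the methods toFileName and translate, and the index.html default for directory-style URIs are all assumptions made for illustration. The point is only that this stage maps links that were already extracted elsewhere to file names; it does not discover new links itself.

```java
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names come from Cocoon.
class LinkTranslationSketch {

    // Derive a file name from a link URI.
    // Assumption: fragment and query string are dropped, and a URI that is
    // empty or ends in "/" is saved as "index.html" in that directory.
    static String toFileName(String uri) {
        String path = uri;
        int cut = path.indexOf('#');      // the fragment comes last in a URI
        if (cut >= 0) {
            path = path.substring(0, cut);
        }
        cut = path.indexOf('?');          // then strip the query string
        if (cut >= 0) {
            path = path.substring(0, cut);
        }
        if (path.isEmpty() || path.endsWith("/")) {
            path = path + "index.html";
        }
        return path;
    }

    // Collect the links handed over by the extraction stage and pair each
    // one with its target file name, preserving the order they arrived in.
    static Map<String, String> translate(Collection<String> links) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String link : links) {
            result.put(link, toFileName(link));
        }
        return result;
    }
}
```

With this sketch, a link such as "docs/" would be written to "docs/index.html", while a stylesheet pulled in only through @import would never appear in the input collection at all, which is why the fix belongs on the extraction side.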


PS I've spent some time on this method, trust me
http://cvs.apache.org/viewcvs.cgi/xml-cocoon2/src/org/apache/cocoon/Attic/Main.java?rev=1.5&content-type=text/vnd.viewcvs-markup
:)

Vadim



