cocoon-dev mailing list archives

From Nicola Ken Barozzi <nicola...@apache.org>
Subject Re: CLI ideas (long)
Date Tue, 01 Apr 2003 12:32:46 GMT


Upayavira wrote, On 31/03/2003 21.11:
> Dear All,
>
> Below is a summary of a brief exchange with Nicola Ken 
> regarding CLI ideas I'd like to implement. He has encouraged 
> me to 'go public', which I am now doing.  

Hey, nobody wants to comment on the CLI changes?
Or is it that we are doing it too well? ;-)

> My aim in the below is twofold: make the CLI into something 
> that is useful to a project I am working on, and also to make 
> the CLI into something that people would prefer to use as 
> opposed to something like wget. [Confession: I'm afraid I 
> still use wget myself.]

:-))

<snip/>
> [Nicola Ken - I didn't understand this bit of your reply:]
> 
>>Actually even the former is managed by Cocoon, I don't remember where but
>>IIRC the Environment has such an info, only that in the current
>>implementation of the CLI environments it's unimplemented.

I mean that the hooks are already there; you just have to fill in the 
implementation.

In the Environment there is

     boolean isResponseModified(long lastModified);
     void setResponseIsNotModified();

But it's never implemented. In AbstractEnvironment:

     public boolean isResponseModified(long lastModified) {
         return true; // always modified
     }

     public void setResponseIsNotModified() {
         // does nothing
     }

So it means that the above has to be first implemented, then used when 
writing to disk.
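
A minimal sketch of how those two hooks could be filled in for a file-writing environment, assuming the check is a plain timestamp comparison against the file already on disk. The names UpToDateCheckSketch, ModificationHooks, FileBackedHooks and shouldWrite() are made up for illustration and are not part of Cocoon; only the two method signatures come from the Environment code quoted above.

```java
import java.io.File;

/** Hypothetical sketch, not the actual Cocoon CLI environment. */
class UpToDateCheckSketch {

    /** The two hooks quoted above, reduced to a stand-alone interface. */
    interface ModificationHooks {
        boolean isResponseModified(long lastModified);
        void setResponseIsNotModified();
    }

    /** Compares the source timestamp with the destination file's timestamp. */
    static class FileBackedHooks implements ModificationHooks {
        private final long destinationLastModified;
        private boolean notModified = false;

        FileBackedHooks(File destination) {
            // File.lastModified() returns 0L when the file does not exist.
            this(destination.lastModified());
        }

        FileBackedHooks(long destinationLastModified) {
            this.destinationLastModified = destinationLastModified;
        }

        @Override
        public boolean isResponseModified(long lastModified) {
            // Regenerate when nothing is on disk yet or the source is newer.
            return destinationLastModified == 0L
                || lastModified > destinationLastModified;
        }

        @Override
        public void setResponseIsNotModified() {
            // Remember the decision so the writer can skip this page.
            notModified = true;
        }

        /** Assumed helper: the disk writer asks this before writing. */
        boolean shouldWrite() {
            return !notModified;
        }
    }
}
```

The caller would check isResponseModified() with the source's timestamp, call setResponseIsNotModified() when it returns false, and then skip the write if shouldWrite() is false.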

> As Nicola Ken pointed out, the links of every page would need to be cached, because 
> when a page is found to be already on disk and up to date, you still need the 
> links for crawling. Hmm. 

Yup.
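
A rough sketch of the link caching idea, assuming an in-memory map from page URI to the links found on that page. LinkCacheSketch, PageGenerator, isUpToDate() and generate() are hypothetical names used only for illustration; a real version would also have to persist the cache between runs.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: keep each page's links so up-to-date pages can be skipped. */
class LinkCacheSketch {

    /** Assumed stand-in for whatever generates a page and extracts its links. */
    interface PageGenerator {
        boolean isUpToDate(String uri);
        /** Writes the page and returns its links (assumed non-null). */
        List<String> generate(String uri);
    }

    private final Map<String, List<String>> linksByPage = new HashMap<>();

    void store(String pageUri, List<String> links) {
        linksByPage.put(pageUri, new ArrayList<>(links));
    }

    /** Crawls from startUri and returns the URIs that had to be regenerated. */
    List<String> crawl(String startUri, PageGenerator generator) {
        List<String> regenerated = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(startUri);
        seen.add(startUri);
        while (!queue.isEmpty()) {
            String uri = queue.remove();
            List<String> links = linksByPage.get(uri);
            if (links == null || !generator.isUpToDate(uri)) {
                // No cached links, or the page is stale: generate and cache.
                links = generator.generate(uri);
                store(uri, links);
                regenerated.add(uri);
            }
            // Cached or fresh, the links are still followed.
            for (String link : links) {
                if (seen.add(link)) {
                    queue.add(link);
                }
            }
        }
        return regenerated;
    }
}
```

The point of the sketch is the last loop: a page that is skipped as up to date still contributes its cached links to the crawl.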

> ---Threading---
> Threading needs reworking as the ThreadedDestination would become
> deprecated. 
...
> There are two possible forms of threading: generation and dispatch
> threading. 
...
> This kind of threading is important for a system that I want to use it for. 
> The pages bear no relevance to each other, and speed of delivery is 
> important.  (I don't plan to implement generation threading ATM).
...
> Final comments from Nicola Ken:
> 
>>What about a publish-subscribe model, with complete decoupling from
>>the publishing and the handling?
> 
> Can you explain more what you mean by this?

I was thinking of a messaging system, like JMS for example, but it's 
overkill.

Go ahead with your needs.

>>As points that are important, I would say in order:
>>
>>  1) make Cocoon *not* output the pages that have an error
>>  2) make cocoon output xxxpagename.error.txt with the errors
>>     of the 'xxxpagename' page (configurable)
>>  3) make the report on broken links in XML so that it can be
>>     added to the site (where to put it configurable)
>>  4) make the content not regenerated if uptodate (very important
>>     from a user POV)
>>  5) use ModifyableSource instead of Destination
>>  6) others
>>
>>Feel free to do whatever in whatever order you prefer, this is just
>>what IMVHO is the priority. 1+2 are needed BTW so that crawlers see
>>broken links correctly, otherwise the site seems ok but instead the
>>broken links are there.
> 
> 
> Do you have ideas as to how to do these (i.e. 1-4)? 5 is of greatest importance to 
> me, but if I can understand what is involved in the others, then I can always have a
> go.

Leave 2 and 3 out then for now.

1 is about not writing out error pages... for one thing, IIUC 
it needs resourceUnavailable() to be configurable (write out or not), 
but I don't know whether other errors are written out directly.
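
A small sketch of what "configurable" could mean here, assuming a single boolean switch. ErrorPagePolicySketch and its fields are hypothetical, and the resourceUnavailable(String) below is only a stand-in named after the method mentioned above; its real signature is not shown in this thread.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: decide whether error pages reach the disk. */
class ErrorPagePolicySketch {
    private final boolean writeErrorPages;
    private final List<String> writtenErrorPages = new ArrayList<>();
    private final List<String> brokenLinks = new ArrayList<>();

    ErrorPagePolicySketch(boolean writeErrorPages) {
        this.writeErrorPages = writeErrorPages;
    }

    /** Stand-in for the hook mentioned above; the signature is an assumption. */
    void resourceUnavailable(String uri) {
        // Always record the failure so a broken-link report stays possible.
        brokenLinks.add(uri);
        if (writeErrorPages) {
            // Placeholder for actually writing the error page to disk.
            writtenErrorPages.add(uri);
        }
    }

    List<String> getBrokenLinks() {
        return brokenLinks;
    }

    List<String> getWrittenErrorPages() {
        return writtenErrorPages;
    }
}
```

With the switch off, the failing URI is still recorded for reporting but nothing is written out for it.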

4 is quite important from a user perspective, but maybe it takes some 
time to do.

Feel really free in doing what you need/prefer, especially if other 
things take you too much time.

-- 
Nicola Ken Barozzi                   nicolaken@apache.org
             - verba volant, scripta manent -
    (discussions get forgotten, just code remains)