cocoon-dev mailing list archives

From "Upayavira" ...@upaya.co.uk>
Subject Re: Extending the Bean (non-HTML)
Date Thu, 14 Aug 2003 19:27:11 GMT
Vadim wrote:

> OT: Have you tried mozilla mail client?

Installing as we speak.

> > * split the bean into a CocoonWrapper that handles configuring a
> >   Cocoon object and handling a single request, and a CocoonBean
> >   which handles crawling
> 
> What is the API of these new beans? Please do not forget that
> CocoonBean is out of the door with 2.1 release and people (might be)
> already building applications with CocoonBean, meaning, you can't
> change CocoonBean API in backward incompatible way without proper
> deprecating and support of released functionality.

But we did document that the API of the bean was unstable. Doesn't that mean we
can change the API where necessary? Of course we should keep such changes to a
minimum. Therefore, I'll redo what I've done so far, being more thorough about
ensuring compatibility.

I'm sure I can manage the split into two classes (which I think greatly aids clarity) 
without breaking any interfaces.

> > * Made the CocoonBean use a Crawler class (derived from the one in
> >   the scratchpad Ant task)
> 
> Do you mean org.apache.cocoon.components.crawler.Crawler? I don't see
> how it can be used in CocoonBean. Can you elaborate?

No. There's a scratchpad Ant task which has its own crawler. I used that. I'd like to 
use o.a.c.components.crawler.Crawler, but I couldn't see how to do it, because it has 
its own link gathering code built into it.

> > * Moved all of the URI logic (mangling URIs etc) into the Target
> > class
> 
> Sounds good.
> 
> > * made it report the time taken to generate a single page
> 
> Ok.
>
> > Next I want to:
> >
> > * move the member variables of the wrapper and bean into a Context
> >   object, so that the Bean can be used in a ThreadSafe environment.
> 
> AFAIU, CocoonBean.processURI is already thread safe. All addTarget()
> methods are obviously not. addTarget() methods can easily be made
> threadsafe (in some sense -- call to addTarget in one thread does not
> break bean but affects process() running in another thread) by
> synchronizing access to the targets collection. It can be thread safe
> in another sense too (calls to processTargets in different threads are
> independent of each other): you just need to add
> processTargets(targets) method.

All of the crawler data is in member variables that will be shared between threads. 
Therefore processTargets(targets) wouldn't in itself be enough.

I can add a crawler class which encapsulates the necessary data. Then a
processTargets(targets) could be threadsafe.
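
As a rough sketch of that idea: the crawl state (pending queue, set of seen URIs) lives in an object created inside each processTargets(targets) call, so two threads never share it. Every name below (CrawlSketch, CrawlState, LinkSource) is hypothetical and is not the actual CocoonBean or crawler API; LinkSource merely stands in for "generate a page and gather its links".

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these names come from the Cocoon code base.
final class CrawlSketch {

    /** Per-call crawl state: one instance per processTargets() call, never shared. */
    static final class CrawlState {
        final Deque<String> pending = new ArrayDeque<>();
        final Set<String> seen = new HashSet<>();

        /** Queue a URI only the first time it is offered. */
        void offer(String uri) {
            if (seen.add(uri)) {
                pending.add(uri);
            }
        }
    }

    /** Stand-in for "generate the page and return the links found in it". */
    interface LinkSource {
        List<String> linksOf(String uri);
    }

    private final LinkSource source; // assumed to be safe to call concurrently

    CrawlSketch(LinkSource source) {
        this.source = source;
    }

    /** No crawl data is kept in member variables, so calls are independent. */
    List<String> processTargets(List<String> targets) {
        CrawlState state = new CrawlState();
        for (String target : targets) {
            state.offer(target);
        }
        List<String> processed = new ArrayList<>();
        while (!state.pending.isEmpty()) {
            String uri = state.pending.remove();
            processed.add(uri);
            for (String link : source.linksOf(uri)) {
                state.offer(link);
            }
        }
        return processed;
    }
}
```

This only isolates the crawler bookkeeping; whatever sits behind LinkSource would still need to be thread safe in its own right.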

> > * rework the way the bean is configured (possibly using
> > Configuration objects)
> 
> Why would you need those Configuration objects?

Er. Good point :-)

I'll stick with what we've got until we've got a good reason to change it. (The
original, now redundant, reason for this was to share xconf reading code between
Main.java and an Ant class, but that isn't really possible as far as I can see).

> > * improve reporting so that it reports pages generated, time taken
> >   per page, the links found in a page, stack trace from errors,
> >   pages that contain broken links, and more.
>
> Ok.
>
> > * Make this reporting use SAX (to a file), so that in future it can
> >   be the basis of a publishing service
> 
> I think that's overkill. Especially writing to the file part. Extend
> BeanListener interface if you like, implement FileBeanListener if you
> need, but I don't think SAX is really what you need here.

Again, I'll leave this until I have a real need.

> > * Get caching working properly, and make it use ifModifiedSince() to
> >   determine whether to save the file or not.
> 
> Must-have feature. Top priority. I hope you've seen my emails on
> persistent store subject.

I certainly did. I got your code, and downloaded and compiled the latest Excalibur 
Store. Unfortunately, on first tests, the CLI seems to have actually got slower. I did 
those tests without stepping through the code, so I've got to check out more of what's 
going on. I agree this is a top priority. I guess I just got a little downhearted at
those results and needed a few days to recover my enthusiasm!

> > * Build a simple Ant task to replace Main.java for ant driven
> > processes
>
> Good.
> 
> > * Make Cocoon work with an external Cocoon object, again for the
> >   sake of a PublishingService
> 
> I don't get this. What Cocoon with which external Cocoon?

This is something that Unico talked about in relation to a publishing service running 
within a Cocoon servlet. Again, I'll wait until we've got an actual plan for such a 
service.

> > * replace the contents of the cli.xconf file with correct settings
> >   for generating documentation from the built webapp, keeping the
> >   documentation system working
> 
> Don't know what you mean.

At the moment, $COCOON/cli.xconf is set up for use by the documentation building
system (in build/cocoon-x.x/documentation/). That is a very specific use, and thus
should have a cli.xconf of its own (if that system is still required). The cli.xconf in
the root should, IMO, show how to generate sites from within build/webapp, for
example generating from the documentation that is in build/webapp/docs/. That
would be much more sensible for users trying to work out how to use a cli.xconf to
configure the CLI.

> > * implement exclude/include, a la Ant in the cli.xconf
> 
> Ok.
> 
> > * make it configurable as to which pages are scanned for links (why
> >   generate /docs/logo.gif?cocoon-view=links)?
> 
> Set of extensions which are not queried for links (configuration
> parameter don't-follow-links=gif, jpg, png)?

Exactly.
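
Something along these lines is what I have in mind. This is only a sketch: the class name LinkScanFilter and its methods are made up for illustration, and the comma-separated value is simply the form Vadim suggests above, not an existing cli.xconf setting.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper; not part of the CocoonBean or the CLI.
final class LinkScanFilter {

    private final Set<String> skippedExtensions = new HashSet<>();

    /** Takes a value such as "gif, jpg, png" (format assumed from the suggestion above). */
    LinkScanFilter(String commaSeparated) {
        for (String ext : commaSeparated.split(",")) {
            String trimmed = ext.trim().toLowerCase(Locale.ROOT);
            if (!trimmed.isEmpty()) {
                skippedExtensions.add(trimmed);
            }
        }
    }

    /** Returns false when the URI's extension is on the skip list. */
    boolean shouldScanForLinks(String uri) {
        int query = uri.indexOf('?');
        String path = query >= 0 ? uri.substring(0, query) : uri;
        int slash = path.lastIndexOf('/');
        int dot = path.lastIndexOf('.');
        if (dot <= slash) {
            return true; // no extension in the last path segment
        }
        String ext = path.substring(dot + 1).toLowerCase(Locale.ROOT);
        return !skippedExtensions.contains(ext);
    }
}
```

With this, the bean would request the links view for /docs/index.html but not for /docs/logo.gif.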

> > * work out how to implement Vadim's idea for a single pipeline with
> >   an XMLTeePipe to generate both a link view and page view in one hit
> 
> Yep. Should increase performance and conformance!

I've spent some time trying to work out how to do this. It seems quite complicated.
Each pipeline, when built, is made up of a generator, a set of transformers and a
serializer. Building a pipeline which splits into two, one half completing normally
and the other going off into a separate 'link-view' pipeline, would require a
specifically built Pipeline class, and would require changes to the treeprocessor to
be able to build it. Am I right, or do you know of a simpler way?
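
For what it's worth, the tee itself looks like the easy part. A minimal sketch using only the standard org.xml.sax API (the class name TeeHandlerSketch is made up, this is not Cocoon's XMLTeePipe, and only a handful of events are forwarded) would be:

```java
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: duplicates a SAX event stream to two consumers.
// A complete version would also forward prefix mappings, processing
// instructions, ignorable whitespace and so on.
final class TeeHandlerSketch extends DefaultHandler {

    private final ContentHandler first;  // e.g. the normal page serializer
    private final ContentHandler second; // e.g. a link-gathering consumer

    TeeHandlerSketch(ContentHandler first, ContentHandler second) {
        this.first = first;
        this.second = second;
    }

    @Override
    public void startDocument() throws SAXException {
        first.startDocument();
        second.startDocument();
    }

    @Override
    public void endDocument() throws SAXException {
        first.endDocument();
        second.endDocument();
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        first.startElement(uri, localName, qName, atts);
        second.startElement(uri, localName, qName, atts);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        first.endElement(uri, localName, qName);
        second.endElement(uri, localName, qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        first.characters(ch, start, length);
        second.characters(ch, start, length);
    }
}
```

The part this does not address is exactly the one I'm asking about: getting the treeprocessor to build a pipeline with such a split in it.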

> > * improve the cli.xconf format to be more flexible, e.g: generate
> >   multiple pages to a single destination, and to have links followed
> >   on some pages but not others, etc
> 
> Ok.
> 
> > Phew. More than I thought! And there's more I haven't mentioned...
> 
> I'm scared! :)

No need to worry, I'm going to follow your incremental steps idea, so you'll see it all 
as it comes along :-)

Thanks for taking the time to reply. I appreciate it.

Regards, Upayavira

