cocoon-dev mailing list archives

From Pier Fumagalli
Subject Re: [ANN] Thanks for it...
Date Sat, 22 May 2004 17:57:50 GMT
On 22 May 2004, at 09:13, Ugo Cei wrote:

> On 21 May 2004, at 18:11, Pier Fumagalli wrote:
>> From this morning at 8:32 AM (BST) is running 
>> off a standard 2.1.5 (head) distribution of Cocoon, Apache 2.0.49 w/ 
>> mod_cache, Jetty 4.2.19, and a hint of my take on the Cocoon kernel 
>> empowering the backend XML data repository...
> Congratulations! Can you give us some more details? How many pages are 
> you serving daily and on which hardware, for instance? I think success 
> stories like yours are important to demonstrate that Cocoon is able to 
> serve lots of content with good performance.

Well, let's say that Cocoon is most definitely NOT the "performant" 
component on the site...

The pages are generated going through something like 2 megs of 
aggregated XML documents, and given the structure of the site (and the 
fact that we're still not 100% confident) we're using non-caching 

In other words, it takes us roughly between 1 and 2 seconds to generate 
a single HTML page (whoa, Bessie)...

But it's all cached on the front end by Apache's mod_disk_cache, so, in 
terms of performance, we don't seem to hit major problems.

And seriously, we don't care much "how long" it takes to create a 
page... We're a news site, so the variation in URLs requested in a day 
is small (currently my cache is filled up with something like 2000 
documents, even if you can have access to almost 100k articles on the

And the architecture (with caching up front powered directly by Apache) 
allows us to withstand "Slashdot-like" attacks very easily (the first 
request coming in triggers the generation, all the remaining freaks get 
the copy cached on disk)...
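
To make the "first one generates, everyone else gets the cached copy" 
idea concrete, here is a minimal sketch in Python. It is NOT how 
mod_disk_cache is implemented: the names FrontCache, render_page and 
simulate_burst, the in-memory store and the 300 second lifetime are all 
made up for illustration, and the sketch is single-threaded.

```python
import time
from typing import Callable, Dict, Tuple


class FrontCache:
    """Toy stand-in for a caching front end (hypothetical, in-memory only)."""

    def __init__(self, generate: Callable[[str], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._generate = generate   # slow backend call (page generation)
        self._ttl = ttl_seconds     # assumed freshness lifetime of an entry
        self._clock = clock         # injectable so expiry can be tested
        self._store: Dict[str, Tuple[float, str]] = {}  # url -> (expiry, body)
        self.backend_calls = 0      # how many requests reached the backend

    def get(self, url: str) -> str:
        now = self._clock()
        entry = self._store.get(url)
        if entry is not None and entry[0] > now:
            return entry[1]         # fresh copy: the backend is not touched
        body = self._generate(url)  # missing or stale: regenerate once
        self.backend_calls += 1
        self._store[url] = (now + self._ttl, body)
        return body


def render_page(url: str) -> str:
    # Placeholder for the slow (1 to 2 second) page generation.
    return f"<html><body>article at {url}</body></html>"


def simulate_burst(requests: int, ttl_seconds: float = 300.0) -> int:
    """Send `requests` hits for one URL; return how many hit the backend."""
    cache = FrontCache(render_page, ttl_seconds)
    for _ in range(requests):
        cache.get("/news/story.html")
    return cache.backend_calls
```

Under these assumptions a burst of any size for one URL costs a single 
page generation, as long as the burst fits inside the lifetime of the 
cached copy, so the backend load depends on how many distinct URLs are 
requested and how often they expire, not on the raw number of hits.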

It was a weird change from JSPs because those were never cached, and we 
had to put a lot of effort into actually making the JSP engine and code 
"fast"... With Cocoon, well, we knew we wouldn't be able to, so we 
thought out other ways to deal with it, and (more importantly) it 
forced us to think of a better and more scalable architecture...

One example above all: advertisement tags... Before, a lot of the 
advertisement code was generated on the server on a PER REQUEST 
basis... Now we can't do this anymore because of the load it would put 
on our server, so we had to re-engineer how we serve ads, relying (for 
instance) more on the client's JavaScript engine... But the knowledge 
that _we_can_not_ pass every single request through to Cocoon helped 
us, in the sense that it made us aware of all those problems that (for 
instance) prevented us from deploying the same application on several 
different machines at the same time (so, no fault tolerance, no load 
balancing, no nothing)...

Now, the AMAZING thing was the SPEED at which the site was 
developed... Three weeks for the whole shebang...

Do that with JSP, yeah, right! :-)

The severe and "restrictive" contracts that Cocoon imposes on the users 
of its services might seem harsh at first (the "how do I do this, 
Cocoon doesn't do that" syndrome was felt quite strongly at the 
beginning), but on the other hand, it forced us to _THINK_... To think 
about what we wanted our website to do, and how each single aspect of 
it related to the rest of the site. Yes, we wrote some small hacks, 
or shortcuts, but amazingly enough, after the first week and a half 
spent by Jerm getting all the information sorted out (with nothing 
moving forward and my manager freaking out), the rest of the 
functionality came out in the remaining two... And we have a TON of 
pages up there...

It proved (to me, to my managers, and to the rest of the team) that 
limitations in contracts, and clearly defined rules and boundaries that 
you cannot step outside of, even if they MIGHT seem counterproductive 
at first, are clearly an advantage in developing and managing complex

In terms of what you ask about performance and so on, I still don't 
have many figures beyond what I mentioned above... I know for sure that 
there's a HELL-OF-A-LOT that we can (and will) improve, but for now we 
decided that, no matter what, we had enough hardware to throw at the 
baby to match any possible requirement...

We started off thinking about 4 machines (HP/380s running Linux w/ 2 
gigs o' RAM and two 3.2 GHz procs each, in other words, big stuff)... We 
already scaled down to only two of those (and we kept two not for 
performance, but for failover)...

In the future I think that we're going to use all four of them (once 
all the sites we host have been moved to Cocoon), but maybe separate out 
the hardware by class of functionality (two for serving/caching, two 
for generating content), but we'll see how the baby adapts and how it 
behaves over the next few weeks...

For now I'm happy that it works, it works better than expected, and 
that the concepts behind the machinery are stronger than any possible 
performance hack you can possibly think of: if you need speed, even if 
Cocoon is not _THAT_ fast, you can get it to serve the heck out of it 
anyhow. You only need to _THINK_ about your problems and not rely on 
some magic software to magically run your badly-designed 
web-application fast enough! :-P

