cocoon-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Joerg Heinicke <>
Subject Re: Avoiding OutOfMemory Errors by limiting data in pipeline
Date Sun, 11 May 2008 23:59:26 GMT
On 09.05.2008 09:41, Peter Hunsberger wrote:

>> I think this is rather hard to do. The place where we instantiate the
>> BufferedOutputStreams (both and o.a.c.util) is
>> AbstractEnvironment.getOutputStream(int bufferSize). So in order to pass a
>> second buffer size argument to the BufferedOutputStream constructor we need
>> to have it available there. One option would be to add it to
>> getOutputStream() - which is an interface change and not really nice.
> I haven't looked at the code here, but couldn't you just introduce a
> second getOutputStream( int bufferSize ) method where the current
> interface method continues with the current default logic if it is
> used?

getOutputStream() actually already takes an int parameter, the flush 
buffer size. Whether to add another getOutputStream() method or modify 
the existing one there is not really a difference IMO. Environment is a 
kind of internal interface (or SPI how it used to be called lately, 
isn't it?). This means there should be only very few implementations 
besides the one we provide if at all (Forrest, Lenya, CLI environment?). 
And in Cocoon we would change all usages of the single-parameterized 
method to the one with 2 parameters. So whoever provides such an 
Environment implementation has to adapt his implementation in a 
meaningful way anyway (empty implementation returning null, throwing 
NotSupportedException, whatever would not work). So it's the same effort 
for them whether to add a new method or changing existing one on the 

IMO the decision should be made purely from a design perspective. Should 
  a configuration parameter passed around as method parameter though it 
is static through the whole lifecycle of the Environment instance? In a 
perfect world I'd say no :) Which leaves the question how to inject the 
parameter. One place is on instantiation (e.g. 
CocoonServlet.getEnvironment(..) in 2.1, 
RequestProcessor.getEnvironment(..) in 2.2) which leaves us with the 
web.xml init parameter (or analogical alternatives for other 
environments) as described.

Another option I found is to setup the environment (i.e. injecting the 
parameter) while setting up the pipeline. AbstractProcessingPipeline is 
the place where we have access to the current flush buffer size 
parameter and call getOutputStream(..) on the environment. It has a 
method setupPipeline(Environment). Why not injecting the parameter(s) 
here? Due to its lifecycle changing the property of the environment 
should not cause any problem since it's a one-time usage object, no 
threading problems or something like that.

I'm just curious what the original reason was to pass the parameter 
along rather than injecting it? Maybe there is a flaw in my thoughts :) 
Whoever knows the code, are my statements correct and what do you think 
about the approach injecting the parameters rather than passing them 
along? Second, if it is a valid approach which way to go?

1) Don't provide a separate configuration option for initial buffer size.
2) Pass both parameters to getOutputStream(..).
3) Leave the current flush buffer size as parameter to 
getOutputStream(..) but inject the other one
a) from web.xml.
b) from pipeline configuration.
4) Inject both buffer sizes, eventually reactivating/reintroducing 
getOutputStream() without any parameter and deprecating the other one.

Many questions, yet another one: What do you think? :)


View raw message