From: Patricia Shanahan <pats@acm.org>
Date: Sun, 01 Aug 2010 19:00:18 -0700
To: river-dev@incubator.apache.org
Subject: Re: TaskManager thread limit

Gregg Wonderly wrote:
> Peter Firmstone wrote:
>> Patricia Shanahan wrote:
>>> In addition to a limit related to the number of runnable tasks, a
>>> TaskManager has a hard limit on the number of threads it will create.
>>>
>>> The parameterless constructor has a limit of 10, and most other uses
>>> have compile-time limits in the range of 10 to 15 threads.
>>>
>>> com.sun.jini.reggie.RegistrarImpl has a compile-time limit of 50
>>> threads.
>>>
>>> com.sun.jini.mahalo.TxnManagerImpl creates two pools, settlerpool and
>>> taskpool. The settlerpool has a limit of 150 threads; the taskpool
>>> has a limit of 50 threads.
>>>
>>> Even 150 threads could be low on a large server, especially if the
>>> threads are used to wait for anything, so that each thread does not
>>> need a hardware thread for a significant fraction of its life.
>>>
>>> As noted in the NIO vs. IO discussion that Peter pointed out, the key
>>> to getting good performance in a simple way is to take advantage of
>>> the fact that an idle thread is a cheap, simple way to remember the
>>> state of some activity.
>>>
>>> There is one approach that would minimize the number of changes but
>>> increase flexibility. We could redefine the maximum thread count as
>>> the maximum number of threads per X, where X is a measure of system
>>> size with a minimum of 1, but increasing on large systems.
>>>
>>> X could be based on the number of processors, the maximum heap
>>> memory, or some combination.
>>
>> I'm in favour of this suggestion.
>
> Auto-sizing is good, but we should also consider putting in logging
> that announces when the limit is reached and waiting will occur. This
> will help people diagnose the potential system pauses, if not
> deadlocks, that will be outwardly visible.

Agreed. I would be interested both in the distribution of times that
tasks spend waiting for other tasks and in the distribution of times
that they spend as ready tasks, waiting for a thread.

> On top of that, I'd be strongly in favor of the introduction of JMX as
> an external observation and management capability for these types of
> values.

I don't know much about JMX, but I strongly agree with external
observation and management.

Incidentally, do you know why the default load level is 3, not 1?

Patricia
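
[Editor's sketch] A rough illustration of the per-X scaling Patricia proposes above, assuming X is derived from processor count and maximum heap with a floor of 1. The class name, method name, and divisors below are invented for this example; nothing like them exists in the River code today.

// Hypothetical helper illustrating the "threads per X" proposal.
public final class ScaledThreadLimit {

    private ScaledThreadLimit() {}

    /**
     * Scales a per-unit thread limit by a rough measure of system
     * size, with a minimum of one unit so small machines keep the
     * existing compile-time limits.
     */
    public static int scale(int threadsPerUnit) {
        int cpus = Runtime.getRuntime().availableProcessors();
        long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);

        // One "unit" of system size per 4 hardware threads or per
        // 1 GiB of maximum heap, whichever gives more units. The
        // divisors are arbitrary placeholders, not agreed values.
        long units = Math.max(1, Math.max(cpus / 4, maxHeapMb / 1024));

        long scaled = (long) threadsPerUnit * units;
        return (int) Math.min(scaled, Integer.MAX_VALUE);
    }
}

Assuming the three-argument TaskManager constructor (maxThreads, timeout, loadFactor), a call site that currently hard-codes something like new TaskManager(50, 15000, 3.0f) could instead pass ScaledThreadLimit.scale(50); whether the scaling belongs at the call sites or inside TaskManager's defaults is a separate decision.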
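
[Editor's sketch] For the logging Gregg suggests, and the wait times Patricia is interested in, the hooks might look something like the following. The class and method names are hypothetical; the idea is only that TaskManager would call one method when the thread limit forces a task to queue and another when a thread finally picks the task up.

import java.util.concurrent.atomic.AtomicLong;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical instrumentation, not part of com.sun.jini.thread.TaskManager.
class PoolInstrumentation {
    private static final Logger logger =
        Logger.getLogger("com.sun.jini.thread.TaskManager");

    private final int maxThreads;
    private final AtomicLong totalWaitMillis = new AtomicLong();
    private final AtomicLong waitCount = new AtomicLong();

    PoolInstrumentation(int maxThreads) {
        this.maxThreads = maxThreads;
    }

    /** Called when a task must queue because no new thread may be created. */
    long noteLimitReached(int readyTasks) {
        if (logger.isLoggable(Level.FINE)) {
            logger.log(Level.FINE,
                "thread limit {0} reached; {1} tasks waiting for a thread",
                new Object[] { maxThreads, readyTasks });
        }
        return System.currentTimeMillis();   // enqueue timestamp
    }

    /** Called when a thread picks the queued task up. */
    void noteTaskStarted(long enqueuedAt) {
        totalWaitMillis.addAndGet(System.currentTimeMillis() - enqueuedAt);
        waitCount.incrementAndGet();
    }

    double meanWaitMillis() {
        long n = waitCount.get();
        return n == 0 ? 0.0 : (double) totalWaitMillis.get() / n;
    }
}

A real version would keep a histogram rather than a running mean, since the interesting question is the distribution, but the hooks would sit in the same two places.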
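
[Editor's sketch] Finally, a minimal standard-MBean outline of the JMX exposure Gregg proposes. The interface, implementation, and ObjectName are all made up for the example; River does not currently register any such bean.

import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Standard MBean: the interface name must be the class name plus "MBean".
public interface TaskManagerStatsMBean {
    int getMaxThreads();
    void setMaxThreads(int maxThreads);   // management, not just observation
    int getActiveThreads();
    int getReadyTasks();
}

class TaskManagerStats implements TaskManagerStatsMBean {
    // The owning pool would update these as threads start and tasks queue.
    private volatile int maxThreads;
    private volatile int activeThreads;
    private volatile int readyTasks;

    TaskManagerStats(int maxThreads) {
        this.maxThreads = maxThreads;
    }

    public int getMaxThreads()                { return maxThreads; }
    public void setMaxThreads(int maxThreads) { this.maxThreads = maxThreads; }
    public int getActiveThreads()             { return activeThreads; }
    public int getReadyTasks()                { return readyTasks; }

    /** Registers the bean so jconsole or any JMX client can see it. */
    void register(String poolName) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName(
            "com.sun.jini.thread:type=TaskManager,name=" + poolName);
        server.registerMBean(this, name);
    }
}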