river-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Peter Firmstone <j...@zeus.net.au>
Subject Re: TaskManager progress
Date Tue, 27 Jul 2010 21:23:29 GMT
Food for thought?


Patricia Shanahan wrote:
> Gregg Wonderly wrote:
>> Patricia Shanahan wrote:
>>> On 7/21/2010 12:58 PM, Gregg Wonderly wrote:
>>> ...
>>>> When I write code of this nature, attempting to remove all 
>>>> contention, I
>>>> try
>>>> to list every "step" that changes the "view" of the world, and 
>>>> think about
>>>> how that "view" can be made atomic by using explicit ordering of 
>>>> statements
>>>> rather than synchronized{} blocks.  ...
>>> I would like to discuss how to approach performance improvement, and 
>>> especially scaling improvement. We seem to have different 
>>> philosophies, and I'm interested in understanding other people's 
>>> approaches to programming.
>>> I try to first find the really big wins, which are almost always 
>>> data structure and algorithm changes. That should result in code 
>>> that is efficient in terms of total CPU time and memory. During that 
>>> part of the process, I prefer to keep the concurrency design as 
>>> simple as possible, which in Java often means using synchronization 
>>> at a coarse level, such as synchronization on a TaskManager instance.
>> I don't think we are too far different.  The big wins are the ones to 
>> go for. What I've learned over the years, debugging Java performance 
>> issues in Jini applications, and elsewhere, is that "synchronized", 
>> while the most "correct" choice in many cases, is also the "slowest" 
>> form of concurrency control.
> Yes, I think it is merely a matter of emphasis. One problem I've seen 
> is a tendency to accept the times without questioning whether they 
> need to be as long as they are, and focus too much too soon on 
> concurrency, before making it fast, so I tend to resist that by 
> keeping the concurrency simple until the data structures and 
> algorithms are right.
>> In a "service" application in particular, it can, in many cases, be 
>> the case that the total CPU time needed to perform the actual work, 
>> is a smaller fraction of the time that "synchronized" injects as 
>> latency in the execution path.  I think you understand this issue, 
>> but I want to illustrate it to make sure.
>> File I/O and network I/O latency can create similar circumstances.  
>> If you look at this with some numbers (I put ms but any magnitude 
>> will show the same behavior) such as the following:
>> 2ms to enter server (Security/Permission stuff)
>> 1ms CPU to arrive at synchronized
>> 3ms through synchronized lock
>> 1ms to return result
>> Then if there are 30 such threads running through the server, 
>> eventually, all of them will be standing in line at the synchronized 
>> (monitor entry) because they can get through all the other stuff in 
>> short order.  As the total number of threads increases, the 
>> "synchronized" section time must be multiplied by the number of 
>> threads, and so with 10 threads, it becomes 30ms of time, because 
>> each thread must wait it's turn.  Thus, 30ms becomes the minimum latency
>> through that part of the system, instead of the 3ms there would be 
>> with 1 thread.
> Suppose a better choice of algorithm and data structures could reduce 
> the time with the synchronized lock from 3ms to 3us. Wouldn't it be 
> better to do that first? With 10 threads, eliminating synchronization 
> may, at best, produce a 10x performance improvement. Picking the right 
> data structures and algorithms can often do better than that.
> An ArrayList is not an efficient representation of a queue, especially 
> under load conditions that may cause a significant backlog.
> I feel that doing some optimizations, including a lot of the 
> concurrency changes, can create complication and commitment that 
> becomes a barrier to algorithm and data structure changes. On the 
> other hand, picking good data structures and algorithms does nothing 
> to bar improving concurrency later.
>> So, eliminating synchronized as a "global" encounter is what I always 
>> consider first.
> I, on the other hand, prefer to minimize the work being done first, 
> and then decide whether there is enough left for synchronization to be 
> a potential bottleneck.
> Note that there is a finite limit to the number of hardware threads 
> that can be supported efficiently, at least with current technology, 
> while maintaining the Java memory model. Really large clusters run as 
> multiple separate SMP computers, each with one or more JVMs of its 
> own, with message passing rather than shared memory communication 
> between the SMP computers.
> Patricia

View raw message