river-dev mailing list archives

From Greg Trasuk <tras...@stratuscom.com>
Subject Re: Concurrency and River
Date Mon, 01 Oct 2007 04:10:02 GMT
Hi Bill and list:

Emphatically yes.  See comments below.



On Sat, 2007-09-29 at 11:07, Bill Venners wrote:
> Hi All,
> I've been observing some wailing and gnashing of teeth lately in  
> various programming communities around what to do about the rise of  
> multi-core processors.
> <snip>

> So here's my question. I get the feeling that the trend to multi-core  
> architectures represents a disruptive technology shift that will  
> shake up the software industry to some extent. Does River have  
> something to offer here? If you expect the chips your software will  
> run on will have multiple cores, and maybe you don't know how many  
> until your program starts running, you'll want to organize your  
> software so it distributes processing across those cores dynamically.  
> Isn't JavaSpaces a good way to do that?

On a recent project that was doing a substantial amount of data analysis
and correlation, the client had a radically overspecced (for two years ago)
server to deploy on: dual Xeons with Hyperthreading and 4GB of memory.
With the hyperthreading, Windows Task Manager showed four processors. 
In the two years I had been watching this particular server on other
projects, I had never seen two processors spiked simultaneously, and I
had never seen a sustained period of utilization over 10-15%.

I decided that for this analysis-heavy project, I was going to make
those dual processors warm up!  I built the application as a data-flow
engine using JavaSpaces, and simply set the number of concurrent workers
on the spaces high enough to load the processors (aside: since the
workers would often be waiting on entries to be completed by other
workers, it's not quite as simple as nworkers=ncpus).  I figured it was
just an added bonus that I could add more workers on a different node if
necessary.  I actually tested additional nodes and found that yes, they
did improve throughput, but we didn't need them in this case.
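
To make the worker arrangement concrete, here is a minimal sketch of the
pattern, not the project's code.  It assumes a plain in-memory stand-in
for the space (two BlockingQueues) instead of a real JavaSpaces service,
and the names SpaceWorkers, runWorkers, taskSpace and resultSpace are
hypothetical.  The squaring step is a placeholder for the analysis work.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class SpaceWorkers {
    // Runs nWorkers threads that take tasks from a shared queue (standing in
    // for a space) and write results to a second queue. Returns sorted results.
    static List<Integer> runWorkers(List<Integer> tasks, int nWorkers) {
        BlockingQueue<Integer> taskSpace = new LinkedBlockingQueue<>(tasks);
        BlockingQueue<Integer> resultSpace = new LinkedBlockingQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(nWorkers);
        for (int i = 0; i < nWorkers; i++) {
            pool.submit(() -> {
                Integer task;
                // poll() returns null once the task queue is drained,
                // which ends this worker's loop.
                while ((task = taskSpace.poll()) != null) {
                    resultSpace.add(task * task); // placeholder for real work
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        List<Integer> results = new ArrayList<>(resultSpace);
        Collections.sort(results);
        return results;
    }

    public static void main(String[] args) {
        // Worker count is a tunable, deliberately not tied to the CPU count.
        int nWorkers = 2 * Runtime.getRuntime().availableProcessors();
        System.out.println(runWorkers(Arrays.asList(1, 2, 3, 4), nWorkers));
    }
}
```

With a real space, each worker would block in JavaSpace.take() on a
template and JavaSpace.write() its result, and the same loop could run
unchanged on another node.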

Anyway, when you hit the application with a query, you definitely see
all four CPU tracks go to 100%.  Unfortunately, they only stay at 100%
about 25% of the time; the rest of the time they're waiting for the
Oracle database on another machine.
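
As a rough illustration of why nworkers=ncpus undershoots when workers
spend time waiting, a common rule of thumb sizes the pool as
ncpus * target utilization * (1 + wait time / compute time).  The sketch
below is only that heuristic, not how the worker count was actually
chosen here; WorkerSizing and workerCount are hypothetical names, and
the 3:1 wait-to-compute ratio is an assumption read off the "100% about
25% of the time" observation.  If the database itself is the bottleneck,
adding workers will not help.

```java
class WorkerSizing {
    // Rule-of-thumb worker count: nCpus * targetUtilization * (1 + wait/compute).
    // waitToComputeRatio is the time a worker spends blocked divided by the
    // time it spends computing (an estimate, normally obtained by measurement).
    static int workerCount(int nCpus, double targetUtilization, double waitToComputeRatio) {
        if (nCpus < 1 || targetUtilization <= 0.0 || targetUtilization > 1.0
                || waitToComputeRatio < 0.0) {
            throw new IllegalArgumentException("invalid sizing parameters");
        }
        double raw = nCpus * targetUtilization * (1.0 + waitToComputeRatio);
        return Math.max(1, (int) Math.ceil(raw));
    }

    public static void main(String[] args) {
        // Four logical CPUs, workers blocked about three times as long as they compute.
        System.out.println(workerCount(4, 1.0, 3.0));
    }
}
```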

But definitely, JavaSpaces is a good way to get all your cores working.
For smaller data packets, you might want to optimize the communications
(e.g. use some kind of in-memory transport rather than going through the
TCP/IP stack), but I suspect you'd have to have very small packets for
that to make much of a difference.

> Anyway, I was curious what everyone here thought. It may be a way to  
> position River in people's minds, give it a marketing story.
Agreed; it's a decent hook.

Greg Trasuk, President
StratusCom Manufacturing Systems Inc. - We use information technology to
solve business problems on your plant floor.
