hadoop-common-user mailing list archives

From "Joydeep Sen Sarma" <jssa...@facebook.com>
Subject RE: Overhead of Java?
Date Thu, 06 Sep 2007 12:48:37 GMT
Came across an interesting site on this topic:
http://shootout.alioth.debian.org/

There's a 'sum-file' benchmark - that's probably close to what a lot of
Hadoop jobs end up doing. The variation across processors is interesting.
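For reference, a sum-file style loop is roughly the following in Java - a
minimal sketch, assuming one integer per line on stdin (the class name
SumFile is just for illustration, not taken from the shootout code):

// Read one integer per line from stdin and print the running total.
// Buffered reads keep the IO pattern comparable to a C version using fgets.
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class SumFile {
    public static void main(String[] args) throws IOException {
        BufferedReader in =
            new BufferedReader(new InputStreamReader(System.in));
        long sum = 0;
        String line;
        while ((line = in.readLine()) != null) {
            sum += Integer.parseInt(line.trim());
        }
        System.out.println(sum);
    }
}

Most of the time in a loop like this goes to line parsing and buffered IO,
which is why the benchmark results are so sensitive to runtime and processor.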

-----Original Message-----
From: Curt Cox [mailto:curtcox@gmail.com] 
Sent: Wednesday, September 05, 2007 2:23 PM
To: hadoop-user@lucene.apache.org
Subject: Re: Overhead of Java?

There is no simple answer to that question.  Some important factors
that spring to mind are:
1) What is the architecture? -- x86, SPARC, etc...
2) What is the operating system?
3) What Java runtime is being used?
4) What C compiler is it being compared against?
5) How long does the job run?
6) What are the memory constraints?
7) How much (if any) floating point work is being done?
8) What sort of IO is being done?
9) How good are you with a profiler?

I've probably missed some important ones.  Also, what do you mean by
systems programming?  That phrase says kernels and device drivers to
me, but I'm sure that's not what you mean.

With a sufficiently stacked deck, either Java or C would handily
trounce the other.  I suspect that the average Hadoop job gives a
slight edge to Java, but I have nothing to back that up.
