hadoop-mapreduce-user mailing list archives

From John Lilley <john.lil...@redpoint.net>
Subject RE: Hadoop throughput question
Date Thu, 03 Jan 2013 22:15:28 GMT
Let's suppose you are running a read-intensive job such as counting records.  This
will be limited by disk bandwidth.  On a 4-node cluster with 2 local SATA disks on each node you
should easily read 400MB/sec in aggregate.  When you are running the Hadoop cluster, is the
Hadoop processing co-located with the Isilon nodes?  Is Hadoop configured to use OneFS or
HDFS?
John
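
For reference, the arithmetic behind these figures can be sketched in a few lines of Python. This is only a back-of-envelope estimate, not a benchmark. The 50 MB/sec per-disk rate is an assumption, inferred from the 400MB/sec aggregate figure above divided over 8 disks. The 1GigE number is the raw line rate with protocol overhead ignored. The function names are hypothetical.

```python
# Back-of-envelope throughput estimate. All inputs are assumptions,
# not measurements from the cluster discussed in this thread.

def aggregate_disk_mb_per_s(nodes, disks_per_node, mb_per_s_per_disk):
    """Aggregate sequential read rate when every disk is read locally."""
    return nodes * disks_per_node * mb_per_s_per_disk


def link_rate_mb_per_s(gbit_per_s=1.0):
    """Raw line rate of one network link in MB/sec (8 bits per byte)."""
    return gbit_per_s * 1000 / 8


if __name__ == "__main__":
    # 4 nodes x 2 SATA disks, assuming 50 MB/sec per disk.
    local = aggregate_disk_mb_per_s(nodes=4, disks_per_node=2, mb_per_s_per_disk=50)
    # One 1GigE link, ignoring TCP/IP and framing overhead.
    link = link_rate_mb_per_s(1.0)
    observed = 30  # MB/sec reported in the original question

    print(f"local disks, aggregate : {local} MB/sec")
    print(f"one 1GigE link, raw    : {link} MB/sec")
    print(f"observed / link rate   : {observed / link:.0%}")
```

If the data is read over the network from external storage rather than from local disks, the per-link rate is the more relevant ceiling, which is why the co-location question matters.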

From: Artem Ervits [mailto:are9004@nyp.org]
Sent: Thursday, January 03, 2013 3:00 PM
To: user@hadoop.apache.org
Subject: Hadoop throughput question

Hello all,

I'd like to pick the community's brain on average throughput for a moderately specced
4-node Hadoop cluster with 1GigE networking. Is it reasonable to expect sustained average speeds
of 150-200MB/sec on such a setup? Forgive me if the question is loaded, but we're running a Hadoop cluster
with HDFS served via EMC Isilon storage. We're getting about 30MB/sec with our machines, and
we do not see a difference in job speed between a 2-node cluster and a 4-node cluster.

Thank you.


