hadoop-common-user mailing list archives

From jafarim <jafa...@gmail.com>
Subject Re: bandwidth (Was: Re: Running on multiple CPU's)
Date Mon, 16 Apr 2007 17:32:11 GMT
We ran HDFS from hadoop 0.9.11 on Linux with JVM 6, ordinary IDE disks, and
a gigabit Ethernet switch with matching NICs. We first wrote a C program
using the native libs provided in the package, then tested again with
distcp. The scenario was as follows:
We ran the test on a cluster with 1 node, then added nodes one by one until
we reached 5 nodes. The same test with Samba saturated the link with only
one node.
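
For a sense of scale, here is a rough back-of-the-envelope sketch in Python.
It is not a measurement. It assumes Samba reached about the nominal rate of
the gigabit link (since it saturated it), ignores protocol overhead, and
applies the "some 1/100" ratio from the earlier message quoted below. The
function and variable names are made up for illustration only.

```python
# Rough estimate only: nominal link rate, no protocol overhead considered.
GIGABIT_LINK_BITS_PER_SEC = 1_000_000_000  # nominal gigabit Ethernet rate


def link_capacity_mb_per_sec(bits_per_sec=GIGABIT_LINK_BITS_PER_SEC):
    """Nominal link capacity in MB/s, with 1 MB taken as 10**6 bytes."""
    return bits_per_sec / 8 / 1_000_000


def throughput_mb_per_sec(bytes_copied, seconds):
    """Throughput of one copy in MB/s, given bytes moved and elapsed time."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return bytes_copied / seconds / 1_000_000


# Assumption: Samba ran at about the nominal link rate.
samba_estimate = link_capacity_mb_per_sec()   # 125.0 MB/s
# Assumption: the HDFS copy ran at roughly 1/100 of that.
hadoop_estimate = samba_estimate / 100        # 1.25 MB/s

print(f"Samba (assumed line rate): {samba_estimate:.2f} MB/s")
print(f"HDFS copy (assumed 1/100): {hadoop_estimate:.2f} MB/s")
```

Timing a copy of a file of known size and passing both values to
throughput_mb_per_sec() would give a real number to compare against these
estimates at each cluster size.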


On 4/16/07, Doug Cutting <cutting@apache.org> wrote:
> Please use a new subject when starting a new topic.
> jafarim wrote:
> > Sorry if being off topic, but we experienced a very low bandwidth with
> > hadoop while copying files to/from the cluster (some 1/100 comparing to
> > plain samba share). The bandwidth did not improve at all by adding
> > nodes to the cluster. At that time I thought that hadoop is not
> > supposed to be used for this purpose and did not use it for my project.
> > I am just curious how much scalable hadoop is and how bandwidth should
> > grow as nodes are added to the cluster.
> It's not clear to me what you tried.  Are you running HDFS?  On how
> large of a cluster?  What version of Hadoop?  What operating system?
> How were you copying files to/from the cluster?
> The 'bin/hadoop distcp' command should scale to consume available
> network bandwidth and disk i/o.
> Doug
