hadoop-common-user mailing list archives

From Steve Loughran <ste...@apache.org>
Subject Re: hadoop becomes very slow in Eucalyptus VMs
Date Thu, 01 Sep 2011 09:50:36 GMT
On 29/08/11 16:32, Shi Yu wrote:
> I installed hadoop-0.20.2 in a Eucalyptus VM environment. The file system
> is based on glusterfs, so it is a shared NAS. Though the nodes are much
> more powerful (8 cores + 15G memory), I found that the hadoop namenode
> and datanodes respond very slowly. For example, after running
> start-all.sh, the datanodes take more than 5 minutes to be ready. The
> time spent in safe mode is very long. Moreover, the program also runs
> much slower than it did on the old physical cluster nodes. I have tried
> running hadoop on a cluster of 15 VM nodes, and also as a pseudo
> cluster on a single VM; all are very slow. Is it because the NAS is an IO
> bottleneck? Creating HDFS on top of glusterfs is like reinventing the
> wheel, so I tried adjusting the replication setting to different values
> (1 to 4), but saw no improvement. I haven't tried the CDH3 package yet.

Why use HDFS at all? If it's a shared filesystem, use file:// URLs.
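
A minimal sketch of what that could look like on 0.20.x, assuming the glusterfs volume is mounted at the same path on every node (the /mnt/gluster path mentioned below is hypothetical): set the default filesystem to the local filesystem in conf/core-site.xml.

```xml
<?xml version="1.0"?>
<!-- conf/core-site.xml: sketch only, adjust to your own cluster -->
<configuration>
  <property>
    <!-- 0.20.x property name for the default filesystem URI -->
    <name>fs.default.name</name>
    <!-- use the local (here: shared, glusterfs-mounted) filesystem instead of hdfs:// -->
    <value>file:///</value>
  </property>
</configuration>
```

With that in place there is no namenode or datanode to start, so only the MapReduce daemons are needed (bin/start-mapred.sh rather than start-all.sh), and jobs can read and write paths such as file:///mnt/gluster/input directly. Directories that the JobTracker and TaskTrackers share, such as mapred.system.dir, would presumably also need to sit on the shared mount.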

> I wonder
> whether switching to CDH3 would bring any significant improvement.

It won't
