hadoop-common-user mailing list archives

From sridhar basam <...@basam.org>
Subject Re: hadoop becomes very slow in Eucalyptus VMs
Date Mon, 29 Aug 2011 16:17:43 GMT
On Mon, Aug 29, 2011 at 11:32 AM, Shi Yu <shiyu@uchicago.edu> wrote:

> I installed hadoop-0.20.2 in a Eucalyptus VM environment. The file system is
> based on glusterfs, so it is a shared NAS. Though the nodes are much more
> powerful (8 cores + 15G memory), I found the response of the Hadoop namenode
> and datanodes became very slow. For example, after running start-all.sh, the
> datanodes take more than 5 minutes to be ready. The safe mode time is
> really long. Moreover, the program also runs much slower than it did
> on the old physical cluster nodes. I have tried running Hadoop on a cluster
> containing 15 VM nodes, and also on a pseudo cluster on a single VM; all are
> very slow. Is it because the NAS is an IO bottleneck? HDFS is created on top
> of glusterfs, which is like reinventing the wheel, so I tried adjusting the
> replication setting to different values (1 to 4), but saw no improvement. I
> haven't tried the CDH3 package yet. I wonder whether switching to CDH3 would
> bring any significant improvement. Any suggestion about this issue is highly
> appreciated.
> Shi
Your problems are likely due to your setup (VMs and your NAS filesystem).
Without additional information it is hard to say where the problem is,
but installing CDH3 isn't going to fix your performance issues. CDH3 is based
on the Apache distribution along with a few additional patches.

You are better off running Hadoop on physical hardware with local storage.
If you want to narrow down the problem, make one change at a time.
It looks like you already had/have a cluster on physical hardware. Bring up a
cluster on just the VMs, without Gluster. Time whatever benchmark you are
using, then introduce another change and repeat the process.
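
To make the one-change-at-a-time comparison concrete, here is a minimal
Python sketch of the timing step. It is only an illustration, not a Hadoop
tool: the helper names (time_command, slowdown) are made up, and the command
being timed is a placeholder you would replace with whatever benchmark job
you already run.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=3):
    """Run cmd `runs` times; return each run's wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises if the benchmark fails, so a broken run is not timed as a fast one.
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


def slowdown(baseline, candidate):
    """Median candidate time divided by median baseline time (>1 means slower)."""
    return statistics.median(candidate) / statistics.median(baseline)


if __name__ == "__main__":
    # Placeholder command: substitute your own benchmark invocation here.
    placeholder = [sys.executable, "-c", "pass"]
    timings = time_command(placeholder, runs=3)
    print("median seconds:", statistics.median(timings))
```

Record the timings once per configuration (physical with local disk, VM with
local disk, VM with Gluster) and compare each step against the previous one
with slowdown(). The step where the ratio jumps is where to look.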

