hadoop-common-user mailing list archives

From Brian Bockelman <bbock...@cse.unl.edu>
Subject Re: HDFS on non-identical nodes
Date Thu, 12 Feb 2009 15:06:41 GMT

On Feb 12, 2009, at 2:54 AM, Deepak wrote:

> Hi,
> We're running a Hadoop cluster on 4 nodes. Our primary purpose is to
> provide a distributed storage solution for internal applications here
> at TellyTopia Inc.
> Our cluster consists of non-identical nodes (one with 1TB, two with
> 3TB, and one with 60GB). While copying data to HDFS, we noticed that
> the node with 60GB of storage ran out of disk space, and even the
> balancer couldn't balance because the cluster was stopped. Now my
> questions are:
> 1. Is Hadoop suitable for non-identical cluster nodes?

Yes.  The nodes in our cluster have between 60GB and 40TB each.  The
majority have around 3TB.

> 2. Is there any way to automatically balance the nodes?

We have a cron script which automatically starts the Balancer.  It's  
dirty, but it works.
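
For illustration only, here is a minimal sketch of what such a cron-driven wrapper might look like.  It is not our actual script.  It assumes the `hadoop` command is on cron's PATH and that `hadoop balancer -threshold <percent>` is how the Balancer gets started; the lock file path, the 10 percent threshold, the range check, and the function names are all hypothetical choices made for the example.

```python
"""Sketch of a cron-friendly wrapper that starts the HDFS Balancer.

Assumptions (not from the original message): `hadoop` is on PATH, the lock
file path is writable, and a 10 percent threshold is acceptable.
"""
import fcntl
import subprocess

LOCK_PATH = "/tmp/hdfs-balancer.lock"  # hypothetical lock file location
HADOOP_BIN = "hadoop"                  # assumes `hadoop` is on cron's PATH
THRESHOLD_PERCENT = 10                 # hypothetical balancing threshold


def build_balancer_command(threshold=THRESHOLD_PERCENT, hadoop_bin=HADOOP_BIN):
    """Return the argument list that starts the Balancer."""
    # Sanity check chosen for this sketch; the threshold is a percentage.
    if not 1 <= threshold <= 100:
        raise ValueError("threshold must be a percentage between 1 and 100")
    return [hadoop_bin, "balancer", "-threshold", str(threshold)]


def run_balancer(command, lock_path=LOCK_PATH):
    """Run `command` while holding an exclusive lock on `lock_path`.

    Returns the command's exit code, or None if a previous run still holds
    the lock (so overlapping cron invocations skip quietly).
    """
    with open(lock_path, "w") as lock_file:
        try:
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            return None  # an earlier run is still in progress
        return subprocess.call(command)


def main():
    """Entry point intended to be invoked from cron."""
    exit_code = run_balancer(build_balancer_command())
    return 0 if exit_code is None else exit_code
```

If this were saved as a module named `balancer_cron` (a made-up name), a crontab entry could call `main()` once a night.  The lock only stops two runs of this wrapper on the same machine from overlapping; it does nothing about how long a single balancing pass takes.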

> 3. Why does a Hadoop cluster stop when one node runs out of disk?

That's not normal.  Trust me, if that was always true, we'd be  
perpetually screwed :)

There might be some other underlying error you're missing...


> Any further input is appreciated!
> Cheers,
> Deepak
> TellyTopia Inc.
