hadoop-hdfs-user mailing list archives

From Stuart Smith <stu24m...@yahoo.com>
Subject keeping an active hdfs cluster balanced
Date Thu, 17 Mar 2011 19:13:44 GMT
Parts of this may end up on the hbase list, but I thought I'd start here. My basic problem:

My cluster is getting full enough that having one data node go down does put a bit of pressure
on the system (when balanced, every DN is more than half full).

I write to (and delete from) HBase pretty actively, and also write some data directly to HDFS.

The cluster keeps drifting dangerously out of balance.

I run the balancer daily, but:

   - I've seen reports that you shouldn't rebalance with regionservers running, yet I don't
really have a choice. Without HBase, my system is pretty much down. If the cluster gets out
of balance, it will also come down.

  Does anybody here have any idea how badly running the balancer on a heavily active system
messes things up (for HDFS or HBase, if anyone knows)?

   - Possibly somewhat related: I'm seeing more "failed to move block" errors in my balancer
logs. It got to the point where I wasn't seeing any effective rebalancing occur. I've turned
off access to the cluster and rebalanced (one node was down to 10% free space, a couple of
others went up to 50% or more). I'm back down to around 20-40% free space on each node (as
reported by the hdfs web interface).

    How effective is the balancer on an active cluster? Is there any way to make its life
easier, so the cluster can stay in balance with daily runs?
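
To make "out of balance" concrete: the balancer treats a datanode as over- or under-utilized when its used percentage differs from the cluster-wide used percentage by more than a threshold (10% by default). The sketch below only illustrates that criterion and is not the balancer's code. The node names and sizes are made up to resemble the situation above (one node with about 10% free, a couple with about 50% free), and `classify_nodes` is a hypothetical helper.

```python
def classify_nodes(nodes, threshold=10.0):
    """Classify datanodes against the cluster-wide utilization.

    nodes: dict mapping node name -> (used_gb, capacity_gb)
    threshold: allowed deviation in percentage points (balancer default is 10)
    Returns (cluster_pct, over_utilized, under_utilized).
    """
    total_used = sum(used for used, _ in nodes.values())
    total_capacity = sum(cap for _, cap in nodes.values())
    cluster_pct = 100.0 * total_used / total_capacity

    over, under = [], []
    for name, (used, cap) in sorted(nodes.items()):
        node_pct = 100.0 * used / cap
        if node_pct > cluster_pct + threshold:
            over.append(name)    # candidate source of block moves
        elif node_pct < cluster_pct - threshold:
            under.append(name)   # candidate target of block moves
    return cluster_pct, over, under


# Illustrative numbers only, not measurements from a real cluster.
example = {
    "dn1": (900, 1000),  # about 10% free
    "dn2": (500, 1000),  # about 50% free
    "dn3": (500, 1000),  # about 50% free
    "dn4": (600, 1000),
}
cluster_pct, over, under = classify_nodes(example)
print(cluster_pct, over, under)
```

With these made-up numbers the favored node is the only one flagged as over-utilized, so every block move has to come off that single node, which is also the one taking the most write traffic.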

I'm not sure why the one node ends up being so heavily favored, either. The favoritism even
seems to survive taking the node down, and bringing it back up. If I can't find the resources
to upgrade, I might try that again, but I'm less than hopeful about it.

Any ideas? Or do I just need better hardware? Not sure if that's an option, though..

Take care,

