hadoop-common-user mailing list archives

From Tom White <...@cloudera.com>
Subject Re: HDFS Safemode and EC2 EBS?
Date Thu, 25 Jun 2009 15:09:26 GMT
Hi Chris,

You should really start all the slave nodes to be sure that you don't
lose data. If you start fewer than #nodes - #replication + 1 nodes,
then you are virtually guaranteed to lose blocks. Starting 6 nodes out
of 10 will cause the filesystem to remain in safe mode, as you've
observed.
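
To put numbers on that (assuming the default replication factor of 3):
with 10 slaves you need at least 10 - 3 + 1 = 8 of them running,
because any block whose three replicas all sit on stopped nodes has no
live copy, and the namenode won't leave safe mode until the configured
fraction of blocks (dfs.safemode.threshold.pct, 0.999 by default) has
been reported. You can check the state, and force it out once you
understand why blocks are missing, with dfsadmin:

  # Report whether the namenode is currently in safe mode
  hadoop dfsadmin -safemode get

  # Force the namenode out of safe mode (only do this once you know
  # the missing blocks are on slaves you chose not to start)
  hadoop dfsadmin -safemode leave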

BTW I've just created a Jira for EBS support
(https://issues.apache.org/jira/browse/HADOOP-6108) which you might be
interested in.
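
Also, to see exactly which files are affected, fsck will list the
missing and under-replicated blocks (run it from any node with access
to the cluster):

  # Walk the namespace and report missing/corrupt/under-replicated blocks
  hadoop fsck / -files -blocks -locations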


On Thu, Jun 25, 2009 at 3:51 PM, Chris Curtin<curtin.chris@gmail.com> wrote:
> Hi,
> I am using 0.19.0 on EC2. The Hadoop execution and HDFS directories are on
> EBS volumes mounted to each node in my EC2 cluster. Only the install of
> hadoop is in the AMI. We have 10 EBS volumes and when the cluster starts it
> randomly picks one for each slave. We don't always start all 10 slaves
> depending on what type of work we are going to do.
> Every third or fourth start of the cluster the namenode goes into safemode
> and won't come out automatically. Restarting datanodes and task trackers on
> each of the slaves doesn't help. Not much in the log files besides the
> message about waiting for the required percentage of blocks to be
> reported. Forcing it out of safe mode allows the cluster to start
> working.
> My only thought is that something is stored on one of the EBS volumes
> that isn't mounted when we start a smaller configuration (say 6 nodes
> instead of 10). But isn't HDFS fault tolerant, so that if a node is
> missing it
> carries on?
> Any advice on why the namenode and datanodes can't find all the data blocks?
> Or where to look for more information about what might be going on?
> Thanks,
> Chris
