hadoop-user mailing list archives

From Bryan Beaudreault <bbeaudrea...@hubspot.com>
Subject Re: Is it necessary to run secondary namenode when starting HDFS?
Date Mon, 17 Dec 2012 17:12:51 GMT
You don't need a secondary namenode for HDFS to work.  It periodically
checkpoints the namenode metadata by merging the edits log into the
fsimage, which keeps the size of the edits files down.  If you don't run
one, your edits files will grow over time, and the next time you restart
your namenode it could take a very long time to start up, because it has
to replay all of those edits.  I recommend running one in production to
reduce downtime if you need to replace or restart your namenode.  If that
isn't a concern for you, then you don't need it.
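
For a small test cluster, something along these lines might do instead of
editing the stock script.  This is only a rough sketch: start_hdfs, run and
DRY_RUN are names I made up for the example (they are not part of Hadoop),
and the /opt/hadoop default is an assumed install location.  The daemon
scripts and their arguments are the ones from the script quoted below.

```shell
#!/bin/sh
# Sketch only: start_hdfs, run and DRY_RUN are hypothetical names for this
# example, not part of Hadoop. /opt/hadoop is an assumed install location.
bin="${HADOOP_HOME:-/opt/hadoop}/bin"
HADOOP_CONF_DIR="${HADOOP_CONF_DIR:-${HADOOP_HOME:-/opt/hadoop}/conf}"

# Run a command, or only print it when DRY_RUN=yes.
run() {
    if [ "${DRY_RUN:-no}" = "yes" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# start_hdfs yes|no
# The argument says whether to also start the secondary namenode (default: no).
start_hdfs() {
    run "$bin"/hadoop-daemon.sh --config "$HADOOP_CONF_DIR" start namenode
    run "$bin"/hadoop-daemons.sh --config "$HADOOP_CONF_DIR" start datanode
    if [ "${1:-no}" = "yes" ]; then
        run "$bin"/hadoop-daemons.sh --config "$HADOOP_CONF_DIR" --hosts masters start secondarynamenode
    fi
}
```

Setting DRY_RUN=yes and calling "start_hdfs no" only prints the two commands
that would be run, which is a cheap way to check paths before starting
anything.  Calling "start_hdfs yes" adds the secondarynamenode line back.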


On Mon, Dec 17, 2012 at 12:04 PM, Ivan Ryndin <iryndin@gmail.com> wrote:

> Hi all,
>
> is it necessary to run secondary namenode when starting HDFS?
> I am dealing with Hadoop 1.1.1.
> Looking at the script $HADOOP_HOME/bin/start-dfs.sh,
> I see the following lines:
>
> # start dfs daemons
> # start namenode after datanodes, to minimize time namenode is up w/o data
> # note: datanodes will log connection errors until namenode starts
> "$bin"/hadoop-daemon.sh --config $HADOOP_CONF_DIR start namenode $nameStartOpt
> "$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR start datanode $dataStartOpt
> "$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR --hosts masters start secondarynamenode
>
> So, will HDFS work if I disable starting the secondarynamenode?
>
> I ask because I am playing with Hadoop on a two-node cluster only
> (and the machines in the cluster do not have much RAM or disk space),
> and thus I don't want to run unnecessary processes.
>
> --
> Best regards,
> Ivan P. Ryndin,
>
>
