hadoop-common-user mailing list archives

From Marc Limotte <mlimo...@feeva.com>
Subject RE: Advice on restarting HDFS in a cron
Date Fri, 24 Apr 2009 18:18:22 GMT
Actually, I'm concerned about the performance of map/reduce jobs on a long-running cluster;
i.e., the cluster seems to get slower the longer it's running.  After a restart of HDFS, the
jobs seem to run faster.  I'm not concerned about the start-up time of HDFS.

Of course, as you suggest, this could be poor configuration of the cluster on my part; but
I'd still like to hear best practices around doing a scheduled restart.

Marc

-----Original Message-----
From: Allen Wittenauer [mailto:aw@yahoo-inc.com]
Sent: Friday, April 24, 2009 10:17 AM
To: core-user@hadoop.apache.org
Subject: Re: Advice on restarting HDFS in a cron




On 4/24/09 9:31 AM, "Marc Limotte" <mlimotte@feeva.com> wrote:
> I've heard that HDFS starts to slow down after it's been running for a long
> time.  And I believe I've experienced this.

We did an upgrade (== complete restart) of a 2000 node instance in ~20
minutes on Wednesday. I wouldn't really consider that 'slow', but YMMV.

I suspect people aren't running the secondary name node and therefore have
a massively large edits file.  The name node appears slow on restart because
it has to apply the edits to the fsimage itself, rather than having the
secondary keep it up to date.
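
One rough way to check for this condition is to look at the size of the edits
file in the name node's storage directory. The sketch below assumes the layout
used by Hadoop releases of this period (`current/edits` under `dfs.name.dir`).
The function name, the `NAME_DIR` placeholder path, and the 64 MB threshold are
illustrative assumptions, not values from this thread.

```shell
#!/bin/sh
# Hypothetical helper: warn when the name node's edits file has grown large.
# NAME_DIR must point at your dfs.name.dir; the default here is a placeholder.
NAME_DIR="${NAME_DIR:-/path/to/dfs/name}"
# Illustrative threshold (64 MB), not a recommendation from the thread.
MAX_EDITS_BYTES="${MAX_EDITS_BYTES:-67108864}"

# Usage: check_edits_size <dfs.name.dir>
# Returns 0 if the file is under the threshold, 1 if over, 2 if missing.
check_edits_size() {
    edits_file="$1/current/edits"
    if [ ! -f "$edits_file" ]; then
        echo "no edits file at $edits_file"
        return 2
    fi
    # wc -c is portable; tr strips the padding some implementations add.
    size=$(wc -c < "$edits_file" | tr -d ' ')
    if [ "$size" -gt "$MAX_EDITS_BYTES" ]; then
        echo "WARNING: edits is $size bytes; is the secondary name node running?"
        return 1
    fi
    echo "edits is $size bytes"
    return 0
}
```

Running `check_edits_size "$NAME_DIR"` on the name node host from time to time
should show whether the edits file keeps growing between restarts.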


-----Original Message-----
From: Marc Limotte

Hi.

I've heard that HDFS starts to slow down after it's been running for a long time, and I believe
I've experienced this.  So I was thinking of setting up a weekly cron job to shut down HDFS
and start it up again.

In concept, it would be something like:

0 0 * * 0 $HADOOP_HOME/bin/stop-dfs.sh; $HADOOP_HOME/bin/start-dfs.sh

But I'm wondering if there is a safer way to do this.  In particular:

* What if a map/reduce job is running when this cron hits?  Is there a way to suspend
  jobs while the HDFS restart happens?

* Should I also restart the mapred daemons?

* Should I wait some time after "stop-dfs.sh" for things to settle down before executing
  "start-dfs.sh"?  Or maybe I should run a command to verify that it is stopped before I
  run the start?

Thanks for any help.
Marc
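
The questions above could be addressed by a wrapper script along these lines.
This is a sketch under assumptions, not a tested procedure. It assumes the stock
`stop-mapred.sh`, `stop-dfs.sh`, `start-dfs.sh`, and `start-mapred.sh` scripts in
`$HADOOP_HOME/bin`, and it uses `hadoop dfsadmin -safemode wait` to block until
the name node has left safe mode. The function names, the 60-second timeout, and
the `pgrep` pattern are hypothetical and may need adjusting for your release. It
does not suspend running jobs, so it is best scheduled for a time when none are
expected.

```shell
#!/bin/sh
# HADOOP_HOME default is a placeholder; set it for your installation.
HADOOP_HOME="${HADOOP_HOME:-/path/to/hadoop}"

# Assumption: HDFS daemons show up with these class names on the command line.
# This only sees daemons on the local host, not data nodes on other machines.
hdfs_running() {
    pgrep -f 'NameNode|DataNode' > /dev/null
}

# Usage: wait_until_stopped <timeout_secs> <check command...>
# Polls once per second while the check command succeeds.
# Returns 0 once the check fails (daemons gone), 1 if the timeout is reached.
wait_until_stopped() {
    timeout_secs="$1"; shift
    waited=0
    while "$@"; do
        if [ "$waited" -ge "$timeout_secs" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

restart_hdfs() {
    # Stop map/reduce first so no tasks are talking to HDFS while it goes down.
    "$HADOOP_HOME/bin/stop-mapred.sh"
    "$HADOOP_HOME/bin/stop-dfs.sh"
    # Verify the daemons are gone instead of sleeping for a fixed time.
    if ! wait_until_stopped 60 hdfs_running; then
        echo "HDFS daemons still running after 60s; not restarting" >&2
        return 1
    fi
    "$HADOOP_HOME/bin/start-dfs.sh"
    # Block until the name node leaves safe mode before starting map/reduce.
    "$HADOOP_HOME/bin/hadoop" dfsadmin -safemode wait
    "$HADOOP_HOME/bin/start-mapred.sh"
}
```

A cron entry would then call a script that sources these functions and runs
`restart_hdfs`, rather than chaining the stop and start scripts directly.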

