From: Sean McNamara
To: "user@hadoop.apache.org"
Subject: Re: TaskTracker slow to start/join
Date: Mon, 3 Dec 2012 04:27:56 +0000

Ahh that makes perfect sense. Thank you!

On 12/2/12 9:25 PM, "Harsh J" wrote:

>Hi,
>
>This is because of the TT's behavior of deleting the mapred.local.dir
>contents every time you restart it. In your version, 1.0.3, that
>process is synchronous and hence it appears like the TT hangs when
>there's a lot of data to purge out from those dirs.
>
>On Mon, Dec 3, 2012 at 7:11 AM, Sean McNamara wrote:
>> I have a TaskTracker on a particular node that is very slow to join the
>> jobtracker. When I start it up with ./hadoop-daemon.sh start tasktracker I
>> see the daemon fire up and running in top. The TaskTracker daemon will sit
>> there using 50% cpu according to top. If it helps any, this cluster is on
>> hadoop 1.0.3. Does anyone know what the TT could be up to?
>>
>> Here is the log output:
>>
>> 2012-12-03 01:28:17,310 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>> 2012-12-03 01:28:17,320 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>> 2012-12-03 01:28:17,321 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>> 2012-12-03 01:28:17,321 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics system started
>> 2012-12-03 01:28:17,515 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>> 2012-12-03 01:28:17,679 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
>> 2012-12-03 01:28:17,726 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>> 2012-12-03 01:28:17,728 INFO org.apache.hadoop.http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context WepAppsContext
>> 2012-12-03 01:28:17,728 INFO org.apache.hadoop.http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context static
>> 2012-12-03 01:28:17,728 INFO org.apache.hadoop.http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs
>> 2012-12-03 01:28:17,747 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>> 2012-12-03 01:28:17,751 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as hadoop
>> 2012-12-03 01:28:17,752 INFO org.apache.hadoop.mapred.TaskTracker: Good mapred local directories are: /mnt/hdfs/data1/mapred,/mnt/hdfs/data2/mapred,/mnt/hdfs/data3/mapred,/mnt/hdfs/data4/mapred,…
>>
>> It takes 10+ minutes to get past this last line, and then it finally
>> continues on and registers in ok with the JT.
>>
>>
>> Thanks
>
>--
>Harsh J
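
To illustrate the point Harsh makes above, here is a minimal sketch in plain Python of why a synchronous purge of a local directory holds up startup, and how moving the old contents aside and deleting them in the background avoids the wait. This is not Hadoop's code and not how any particular Hadoop release implements the cleanup; the function names purge_sync and purge_async and the ".purge-" directory prefix are hypothetical and chosen only for this example.

```python
import os
import shutil
import threading
import uuid


def purge_sync(local_dir):
    """Delete everything under local_dir; returns only once it is all gone.

    With a lot of leftover data, the caller (think: daemon startup)
    blocks here for as long as the deletion takes.
    """
    for name in os.listdir(local_dir):
        path = os.path.join(local_dir, name)
        if os.path.isdir(path) and not os.path.islink(path):
            shutil.rmtree(path)
        else:
            os.remove(path)


def purge_async(local_dir):
    """Move the old contents aside, recreate an empty dir, delete in background.

    Assumes the parent of local_dir is on the same filesystem, so the
    rename does not have to copy any data. Returns the worker thread
    so a caller can join() it if needed.
    """
    parent = os.path.dirname(os.path.abspath(local_dir))
    trash = os.path.join(parent, ".purge-" + uuid.uuid4().hex)  # hypothetical name
    os.rename(local_dir, trash)
    os.makedirs(local_dir)
    worker = threading.Thread(target=shutil.rmtree, args=(trash,))
    worker.start()
    return worker
```

With purge_sync the caller cannot move on until every file is removed, which matches the 10+ minute pause after the "Good mapred local directories" line. With purge_async the caller gets an empty directory back almost immediately, and the slow removal happens off the startup path.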