Subject: Re: Stuck Job - how should I troubleshoot?
From: Serge Blazhievsky
Date: Sun, 20 Apr 2014 07:46:41 -0700
To: user@hadoop.apache.org

It could be that one step of the job takes particularly long. Take a look at the job's counters. If they are changing, the job is not stuck; it is just taking a long time.

Once you know which case you are in, you can either debug a deadlock or apply optimization techniques.

Serge

> On Apr 20, 2014, at 4:12, Clay McDonald wrote:
>
> Hello all. Please see the attached screenshot. I have a job that is stuck. I’ve looked in the logs but don’t see anything that jumps out at me. How should I troubleshoot this? Thanks, Clay
>
> [Attachment: stuck_mapreduce_job.jpg]
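As a rough illustration of the counter check suggested above, here is a minimal Python sketch. It assumes you have already collected a few snapshots of the job's counters some time apart, for example by copying them from the job's web UI or from the output of `mapred job -status <job-id>`. The helper names `changed_counters` and `looks_stuck` and the counter values are hypothetical and are not part of Hadoop.

```python
from typing import Dict, List

# A snapshot maps a counter name to its value at one point in time.
Counters = Dict[str, int]


def changed_counters(before: Counters, after: Counters) -> List[str]:
    """Return the sorted names of counters whose values differ between two snapshots."""
    names = set(before) | set(after)
    # A counter missing from one snapshot is treated as 0 there.
    return sorted(n for n in names if before.get(n, 0) != after.get(n, 0))


def looks_stuck(snapshots: List[Counters]) -> bool:
    """Return True if no counter changed between any two consecutive snapshots."""
    if len(snapshots) < 2:
        raise ValueError("need at least two snapshots to compare")
    return all(not changed_counters(a, b) for a, b in zip(snapshots, snapshots[1:]))


# Hypothetical values, taken a few minutes apart.
first = {"MAP_INPUT_RECORDS": 1_000_000, "REDUCE_INPUT_RECORDS": 0}
second = {"MAP_INPUT_RECORDS": 1_250_000, "REDUCE_INPUT_RECORDS": 0}

print(changed_counters(first, second))  # which counters moved
print(looks_stuck([first, second]))     # whether anything moved at all
```

If counters keep moving between snapshots, the job is probably slow rather than hung, and the counters that move can hint at which phase is taking the time. If nothing changes over a reasonably long interval, it is worth looking for a deadlock or a task that is waiting on something.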