From: Benjamin Kim
Subject: Completed Tasks in YARN will not release resources
Message-Id: <51BDA3B8-3B71-4D1C-853A-2ED41626F157@gmail.com>
Date: Sat, 30 Apr 2016 05:56:32 -0700
To: user@hadoop.apache.org

Has anyone encountered this problem with YARN? It all started after an attempt to upgrade from CDH 5.4.8 to CDH 5.5.2.

I ran jobs overnight, and they never completed. The run did, however, take down the YARN ResourceManager and multiple NodeManagers after 5 or 6 hours. In one job with 450 mappers, only 64 completed; 386 were pending and 0 were running. The pending mappers are in a Scheduled state.

Each data node has 24 cores, 64 GB of memory, and 6 drives x 2 TB. The NodeManager is allocated 45 GB; mappers are allocated 4 GB (3.2 GB heap); reducers are allocated 8 GB (6.4 GB heap); the AM is allocated 8 GB (6.4 GB heap).

There was a note on each completed task attempt stating that it had been in the finished state for too long. I believe that the task is not releasing its container when it's done, or is not communicating the completion back.

Does anyone have any ideas?

Thanks,
Ben
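
For reference, the memory figures quoted in the message can be checked with a little arithmetic. This is only a rough sketch: it assumes memory is the sole limit on container placement (vcores and scheduler settings are ignored), and the names `containers_per_node` and `heap_fraction` are made up for illustration, not part of any Hadoop API.

```python
# Figures quoted in the message, in GB.
NODEMANAGER_GB = 45
CONTAINER_GB = {"mapper": 4, "reducer": 8, "am": 8}
HEAP_GB = {"mapper": 3.2, "reducer": 6.4, "am": 6.4}


def containers_per_node(container_gb, node_gb=NODEMANAGER_GB):
    """Upper bound on same-sized containers one NodeManager can host.

    Assumption: only memory constrains placement.
    """
    return node_gb // container_gb


def heap_fraction(kind):
    """Share of a container's memory that is given to the JVM heap."""
    return HEAP_GB[kind] / CONTAINER_GB[kind]


if __name__ == "__main__":
    for kind, size in CONTAINER_GB.items():
        print(kind, containers_per_node(size), round(heap_fraction(kind), 2))
```

Under those assumptions a node could hold at most 11 mapper containers at once, so 0 running mappers with 386 pending suggests memory is not being returned to the scheduler, rather than the cluster simply being undersized.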