Mailing-List: contact issues-help@ambari.apache.org; run by ezmlm
Reply-To: dev@ambari.apache.org
Date: Mon, 11 Jul 2016 09:23:11 +0000 (UTC)
From: "Andrew Onischuk (JIRA)"
To: issues@ambari.apache.org
Subject: [jira] [Updated] (AMBARI-17646) Nodemanager is not started after installation
archived-at: Mon, 11 Jul 2016 09:23:13 -0000

     [ https://issues.apache.org/jira/browse/AMBARI-17646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Onischuk updated AMBARI-17646:
-------------------------------------
    Status: Patch Available  (was: Open)

> Nodemanager is not started after installation
> ---------------------------------------------
>
>                 Key: AMBARI-17646
>                 URL: https://issues.apache.org/jira/browse/AMBARI-17646
>             Project: Ambari
>          Issue Type: Bug
>            Reporter: Andrew Onischuk
>            Assignee: Andrew Onischuk
>             Fix For: 2.4.0
>
>         Attachments: AMBARI-17646.patch
>
>
> Nodemanager is down on one of the nodes after installation. This has impacted
> most of the splits in today's run (ambari-2.4.0.0-817).
> Nodemanager is found to be down on one of the nodes in a 3-node cluster and is
> running on the other two nodes.
> Live cluster is available here and is alive for
> another 24hrs
> The error below is seen in nodemanager.log:
>
> 2016-07-10 04:40:59,678 INFO recovery.NMLeveldbStateStoreService (NMLeveldbStateStoreService.java:checkVersion(1022)) - Loaded NM state version info 1.0
> 2016-07-10 04:40:59,889 WARN nodemanager.LinuxContainerExecutor (LinuxContainerExecutor.java:init(195)) - Exit code from container executor initialization is : 24
> ExitCodeException exitCode=24: File /etc/hadoop/2.4.2.0-258/0 must be owned by root, but is owned by 2530
>     at org.apache.hadoop.util.Shell.runCommand(Shell.java:576)
>     at org.apache.hadoop.util.Shell.run(Shell.java:487)
>     at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:753)
>     at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:192)
>     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:236)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:547)
>     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:595)
> 2016-07-10 04:40:59,893 INFO nodemanager.ContainerExecutor (ContainerExecutor.java:logOutput(322)) -
> 2016-07-10 04:40:59,893 INFO service.AbstractService (AbstractService.java:noteFailure(272)) - Service NodeManager failed in state INITED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor
>     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:238)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:547)
>     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:595)
> Caused by: java.io.IOException: Linux container executor not configured properly (error=24)
>     at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:198)
>     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:236)
>     ... 3 more
> Caused by: ExitCodeException exitCode=24: File /etc/hadoop/2.4.2.0-258/0 must be owned by root, but is owned by 2530
>     at org.apache.hadoop.util.Shell.runCommand(Shell.java:576)
>     at org.apache.hadoop.util.Shell.run(Shell.java:487)
>     at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:753)
>     at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:192)
>     ... 4 more
> 2016-07-10 04:40:59,895 FATAL nodemanager.NodeManager (NodeManager.java:initAndStartNodeManager(550)) - Error starting NodeManager
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor
>     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:238)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:547)
>     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:595)
> Caused by: java.io.IOException: Linux container executor not configured properly (error=24)
>     at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:198)
>     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:236)
>     ... 3 more
> Caused by: ExitCodeException exitCode=24: File /etc/hadoop/2.4.2.0-258/0 must be owned by root, but is owned by 2530
>     at org.apache.hadoop.util.Shell.runCommand(Shell.java:576)
>     at org.apache.hadoop.util.Shell.run(Shell.java:487)
>     at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:753)
>     at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:192)
>     ... 4 more
> 2016-07-10 04:40:59,898 INFO nodemanager.NodeManager (LogAdapter.java:info(45)) - SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down NodeManager at nat-d7-xals-ambarieu-newamb-242-1-1/172.22.66.62

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)