Mailing-List: contact yarn-issues-help@hadoop.apache.org; run by ezmlm
Reply-To: yarn-issues@hadoop.apache.org
Date: Fri, 18 Dec 2015 15:20:46 +0000 (UTC)
From: "Wei-Chiu Chuang (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Updated] (YARN-4467) Shell.checkIsBashSupported swallowed an interrupted exception


     [ https://issues.apache.org/jira/browse/YARN-4467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated YARN-4467:
----------------------------------
    Description:
Shell.checkIsBashSupported() creates a bash shell command to verify whether the system supports bash. However, its error message is misleading, and the logic should be updated.

If the shell command throws an IOException, that does not imply that bash did not run successfully.
If the shell command process was interrupted, its internal logic throws an InterruptedIOException, which is a subclass of IOException.
{code:title=Shell.checkIsBashSupported|borderStyle=solid}
    ShellCommandExecutor shexec;
    boolean supported = true;
    try {
      String[] args = {"bash", "-c", "echo 1000"};
      shexec = new ShellCommandExecutor(args);
      shexec.execute();
    } catch (IOException ioe) {
      LOG.warn("Bash is not supported by the OS", ioe);
      supported = false;
    }
{code}
An example of this appeared in a recent Jenkins job:
https://builds.apache.org/job/PreCommit-HADOOP-Build/8257/testReport/org.apache.hadoop.ipc/TestRPCWaitForProxy/testInterruptedWaitForProxy/

The test logic in TestRPCWaitForProxy.testInterruptedWaitForProxy starts a thread, waits 1 second, and interrupts the thread, expecting the thread to terminate. However, the method Shell.checkIsBashSupported swallowed the interrupt, so the test failed.
{noformat}
2015-12-16 21:31:53,797 WARN  util.Shell (Shell.java:checkIsBashSupported(718)) - Bash is not supported by the OS
java.io.InterruptedIOException: java.lang.InterruptedException
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:930)
    at org.apache.hadoop.util.Shell.run(Shell.java:838)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1117)
    at org.apache.hadoop.util.Shell.checkIsBashSupported(Shell.java:716)
    at org.apache.hadoop.util.Shell.<clinit>(Shell.java:705)
    at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:79)
    at org.apache.hadoop.security.SecurityUtil.getAuthenticationMethod(SecurityUtil.java:639)
    at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:273)
    at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:261)
    at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:803)
    at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:773)
    at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:646)
    at org.apache.hadoop.ipc.RPC.waitForProtocolProxy(RPC.java:397)
    at org.apache.hadoop.ipc.RPC.waitForProtocolProxy(RPC.java:350)
    at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:330)
    at org.apache.hadoop.ipc.TestRPCWaitForProxy$RpcThread.run(TestRPCWaitForProxy.java:115)
Caused by: java.lang.InterruptedException
    at java.lang.Object.wait(Native Method)
    at java.lang.Object.wait(Object.java:503)
    at java.lang.UNIXProcess.waitFor(UNIXProcess.java:264)
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:920)
    ... 15 more
{noformat}

The original design is not desirable, as it swallowed a potential interrupt, causing TestRPCWaitForProxy.testInterruptedWaitForProxy to fail. Unfortunately, this method is called during static initialization (Shell.<clinit> in the trace above), and Java does not allow a static initializer to throw a checked exception. We should remove the static member variable so that the method can throw the interrupted exception. The node manager should call the static method instead of using the static member variable.

This fix has an associated benefit: the tests could run faster, because they would no longer need to spawn a bash process whenever they access a Shell static member variable (which happens quite often, to check which operating system Hadoop is running on).

  was:
Shell.checkIsBashSupported() creates a bash shell command to verify whether the system supports bash. However, its error message is misleading, and the logic should be updated.

If the shell command throws an IOException, that does not imply that bash did not run successfully. If the shell command process was interrupted, its internal logic throws an InterruptedIOException, which is a subclass of IOException.
{code:title=Shell.checkIsBashSupported|borderStyle=solid}
    ShellCommandExecutor shexec;
    boolean supported = true;
    try {
      String[] args = {"bash", "-c", "echo 1000"};
      shexec = new ShellCommandExecutor(args);
      shexec.execute();
    } catch (IOException ioe) {
      LOG.warn("Bash is not supported by the OS", ioe);
      supported = false;
    }
{code}
An example of this appeared in a recent Jenkins job:
https://builds.apache.org/job/PreCommit-HADOOP-Build/8257/testReport/org.apache.hadoop.ipc/TestRPCWaitForProxy/testInterruptedWaitForProxy/

The test logic in TestRPCWaitForProxy.testInterruptedWaitForProxy starts a thread, waits 1 second, and interrupts the thread, expecting the thread to terminate. However, the method Shell.checkIsBashSupported swallowed the interrupt, so the test failed.
{noformat}
2015-12-16 21:31:53,797 WARN  util.Shell (Shell.java:checkIsBashSupported(718)) - Bash is not supported by the OS
java.io.InterruptedIOException: java.lang.InterruptedException
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:930)
    at org.apache.hadoop.util.Shell.run(Shell.java:838)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1117)
    at org.apache.hadoop.util.Shell.checkIsBashSupported(Shell.java:716)
    at org.apache.hadoop.util.Shell.<clinit>(Shell.java:705)
    at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:79)
    at org.apache.hadoop.security.SecurityUtil.getAuthenticationMethod(SecurityUtil.java:639)
    at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:273)
    at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:261)
    at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:803)
    at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:773)
    at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:646)
    at org.apache.hadoop.ipc.RPC.waitForProtocolProxy(RPC.java:397)
    at org.apache.hadoop.ipc.RPC.waitForProtocolProxy(RPC.java:350)
    at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:330)
    at org.apache.hadoop.ipc.TestRPCWaitForProxy$RpcThread.run(TestRPCWaitForProxy.java:115)
Caused by: java.lang.InterruptedException
    at java.lang.Object.wait(Native Method)
    at java.lang.Object.wait(Object.java:503)
    at java.lang.UNIXProcess.waitFor(UNIXProcess.java:264)
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:920)
    ... 15 more
{noformat}

The original design is not desirable, as it swallowed a potential interrupt, causing TestRPCWaitForProxy.testInterruptedWaitForProxy to fail. Unfortunately, this method is called during static initialization (Shell.<clinit> in the trace above), and Java does not allow a static initializer to throw a checked exception. We should remove the static member variable so that the method can throw the interrupted exception. The node manager should call the static method instead of using the static member variable.


> Shell.checkIsBashSupported swallowed an interrupted exception
> -------------------------------------------------------------
>
>                 Key: YARN-4467
>                 URL: https://issues.apache.org/jira/browse/YARN-4467
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>            Reporter: Wei-Chiu Chuang
>              Labels: shell, supportability
>         Attachments: HADOOP-12652.001.patch, YARN-4467.001.patch
>
>
> Shell.checkIsBashSupported() creates a bash shell command to verify whether the system supports bash. However, its error message is misleading, and the logic should be updated.
> If the shell command throws an IOException, that does not imply that bash did not run successfully. If the shell command process was interrupted, its internal logic throws an InterruptedIOException, which is a subclass of IOException.
> {code:title=Shell.checkIsBashSupported|borderStyle=solid}
>     ShellCommandExecutor shexec;
>     boolean supported = true;
>     try {
>       String[] args = {"bash", "-c", "echo 1000"};
>       shexec = new ShellCommandExecutor(args);
>       shexec.execute();
>     } catch (IOException ioe) {
>       LOG.warn("Bash is not supported by the OS", ioe);
>       supported = false;
>     }
> {code}
> An example of this appeared in a recent Jenkins job:
> https://builds.apache.org/job/PreCommit-HADOOP-Build/8257/testReport/org.apache.hadoop.ipc/TestRPCWaitForProxy/testInterruptedWaitForProxy/
> The test logic in TestRPCWaitForProxy.testInterruptedWaitForProxy starts a thread, waits 1 second, and interrupts the thread, expecting the thread to terminate. However, the method Shell.checkIsBashSupported swallowed the interrupt, so the test failed.
> {noformat}
> 2015-12-16 21:31:53,797 WARN  util.Shell (Shell.java:checkIsBashSupported(718)) - Bash is not supported by the OS
> java.io.InterruptedIOException: java.lang.InterruptedException
>     at org.apache.hadoop.util.Shell.runCommand(Shell.java:930)
>     at org.apache.hadoop.util.Shell.run(Shell.java:838)
>     at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1117)
>     at org.apache.hadoop.util.Shell.checkIsBashSupported(Shell.java:716)
>     at org.apache.hadoop.util.Shell.<clinit>(Shell.java:705)
>     at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:79)
>     at org.apache.hadoop.security.SecurityUtil.getAuthenticationMethod(SecurityUtil.java:639)
>     at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:273)
>     at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:261)
>     at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:803)
>     at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:773)
>     at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:646)
>     at org.apache.hadoop.ipc.RPC.waitForProtocolProxy(RPC.java:397)
>     at org.apache.hadoop.ipc.RPC.waitForProtocolProxy(RPC.java:350)
>     at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:330)
>     at org.apache.hadoop.ipc.TestRPCWaitForProxy$RpcThread.run(TestRPCWaitForProxy.java:115)
> Caused by: java.lang.InterruptedException
>     at java.lang.Object.wait(Native Method)
>     at java.lang.Object.wait(Object.java:503)
>     at java.lang.UNIXProcess.waitFor(UNIXProcess.java:264)
>     at org.apache.hadoop.util.Shell.runCommand(Shell.java:920)
>     ... 15 more
> {noformat}
> The original design is not desirable, as it swallowed a potential interrupt, causing TestRPCWaitForProxy.testInterruptedWaitForProxy to fail. Unfortunately, this method is called during static initialization (Shell.<clinit> in the trace above), and Java does not allow a static initializer to throw a checked exception. We should remove the static member variable so that the method can throw the interrupted exception. The node manager should call the static method instead of using the static member variable.
> This fix has an associated benefit: the tests could run faster, because they would no longer need to spawn a bash process whenever they access a Shell static member variable (which happens quite often, to check which operating system Hadoop is running on).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
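
Illustrative note on the catch ordering discussed in the description: because InterruptedIOException is a subclass of IOException, a single catch (IOException) block treats an interrupt the same way as a missing bash binary. The following is only a minimal sketch of one way to keep the two cases apart, not the attached patch. The class name BashCheckSketch and the Command interface are hypothetical stand-ins (Command takes the place of ShellCommandExecutor.execute()), and the choice to restore the thread's interrupt status before rethrowing is an assumption about what callers would want.

```java
import java.io.IOException;
import java.io.InterruptedIOException;

// Hypothetical sketch; none of these names come from Hadoop.
class BashCheckSketch {

    /** Hypothetical stand-in for ShellCommandExecutor.execute(). */
    interface Command {
        void execute() throws IOException;
    }

    /**
     * Returns true if the command ran, false if it failed with an ordinary
     * IOException. An interrupt is not treated as "bash unsupported": the
     * interrupt status is restored and the exception is rethrown.
     */
    static boolean checkIsBashSupported(Command cmd) throws InterruptedIOException {
        try {
            cmd.execute();
            return true;
        } catch (InterruptedIOException iioe) {
            // Must come before catch (IOException): this is the subclass.
            Thread.currentThread().interrupt(); // re-assert the interrupt for callers
            throw iioe;
        } catch (IOException ioe) {
            // Only a genuine I/O failure means the command could not run.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Command succeeds.
        System.out.println(checkIsBashSupported(() -> { }));
        // Command fails with a plain IOException.
        System.out.println(checkIsBashSupported(() -> {
            throw new IOException("bash: not found");
        }));
        // Command is interrupted: the exception propagates to the caller.
        try {
            checkIsBashSupported(() -> {
                throw new InterruptedIOException("interrupted");
            });
        } catch (InterruptedIOException e) {
            // Thread.interrupted() reads and clears the restored flag.
            System.out.println("interrupt propagated, flag restored: " + Thread.interrupted());
        }
    }
}
```

A method like this can declare the checked exception only because it is an ordinary static method called on demand. If it were invoked from a static field initializer, as in the trace above, the exception could not be declared, which is why the description proposes replacing the static member variable with a method call.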