Date: Tue, 30 May 2017 15:14:04 +0000 (UTC)
From: "John Zhuge (JIRA)"
To: common-issues@hadoop.apache.org
Subject: [jira] [Updated] (HADOOP-13770) Shell.checkIsBashSupported swallowed an interrupted exception

     [ https://issues.apache.org/jira/browse/HADOOP-13770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Zhuge updated HADOOP-13770:
--------------------------------
    Priority: Minor  (was: Blocker)

> Shell.checkIsBashSupported swallowed an interrupted exception
> -------------------------------------------------------------
>
>                 Key: HADOOP-13770
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13770
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: util
>            Reporter: Wei-Chiu Chuang
>            Assignee: Wei-Chiu Chuang
>            Priority: Minor
>              Labels: oct16-easy, shell, supportability
>             Fix For: 2.8.0, 3.0.0-alpha2
>
>         Attachments: HADOOP-12652.001.patch, YARN-4467.001.patch
>
>
> Shell.checkIsBashSupported() runs a bash shell command to verify that the system supports bash. However, its error message is misleading, and the logic should be updated.
> If the shell command throws an IOException, that does not imply that bash failed to run.
> If the shell command process is interrupted, its internal logic throws an InterruptedIOException, which is a subclass of IOException and is therefore caught by the same handler.
> {code:title=Shell.checkIsBashSupported|borderStyle=solid}
>     ShellCommandExecutor shexec;
>     boolean supported = true;
>     try {
>       String[] args = {"bash", "-c", "echo 1000"};
>       shexec = new ShellCommandExecutor(args);
>       shexec.execute();
>     } catch (IOException ioe) {
>       LOG.warn("Bash is not supported by the OS", ioe);
>       supported = false;
>     }
> {code}
> An example of this appeared in a recent Jenkins job:
> https://builds.apache.org/job/PreCommit-HADOOP-Build/8257/testReport/org.apache.hadoop.ipc/TestRPCWaitForProxy/testInterruptedWaitForProxy/
> The test logic in TestRPCWaitForProxy.testInterruptedWaitForProxy starts a thread, waits 1 second, and interrupts the thread, expecting it to terminate. However, Shell.checkIsBashSupported swallowed the interrupt, so the test failed.
> {noformat}
> 2015-12-16 21:31:53,797 WARN  util.Shell (Shell.java:checkIsBashSupported(718)) - Bash is not supported by the OS
> java.io.InterruptedIOException: java.lang.InterruptedException
>     at org.apache.hadoop.util.Shell.runCommand(Shell.java:930)
>     at org.apache.hadoop.util.Shell.run(Shell.java:838)
>     at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1117)
>     at org.apache.hadoop.util.Shell.checkIsBashSupported(Shell.java:716)
>     at org.apache.hadoop.util.Shell.<clinit>(Shell.java:705)
>     at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:79)
>     at org.apache.hadoop.security.SecurityUtil.getAuthenticationMethod(SecurityUtil.java:639)
>     at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:273)
>     at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:261)
>     at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:803)
>     at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:773)
>     at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:646)
>     at org.apache.hadoop.ipc.RPC.waitForProtocolProxy(RPC.java:397)
>     at org.apache.hadoop.ipc.RPC.waitForProtocolProxy(RPC.java:350)
>     at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:330)
>     at org.apache.hadoop.ipc.TestRPCWaitForProxy$RpcThread.run(TestRPCWaitForProxy.java:115)
> Caused by: java.lang.InterruptedException
>     at java.lang.Object.wait(Native Method)
>     at java.lang.Object.wait(Object.java:503)
>     at java.lang.UNIXProcess.waitFor(UNIXProcess.java:264)
>     at org.apache.hadoop.util.Shell.runCommand(Shell.java:920)
>     ... 15 more
> {noformat}
> The original design is undesirable because it swallows a potential interrupt, causing TestRPCWaitForProxy.testInterruptedWaitForProxy to fail. Unfortunately, because the method is called from a static initializer (it sets a static member variable), Java does not allow it to throw a checked exception. We should remove the static member variable, so that the method can throw the interrupt exception. The node manager should call the static method instead of using the static member variable.
> This fix has an associated benefit: tests could run faster, because they would no longer need to spawn a bash process whenever they use a Shell static method or variable (which happens quite often when checking which operating system Hadoop is running on).
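
For illustration, the interrupt-swallowing catch block described in the issue could be split so that an interrupt is kept apart from an ordinary I/O failure. The following is only a minimal sketch under stated assumptions, not the committed Hadoop patch: the class BashProbeSketch and the Probe interface are hypothetical names, and Probe merely stands in for ShellCommandExecutor.execute() so that the example runs without Hadoop or bash. Because the method declares a checked exception, it cannot be used to initialize a static member variable, which matches the issue's suggestion to call it as a method instead.

```java
import java.io.IOException;
import java.io.InterruptedIOException;

/** Hypothetical sketch; not the actual Hadoop implementation. */
final class BashProbeSketch {

  /** Hypothetical stand-in for ShellCommandExecutor.execute(). */
  interface Probe {
    void run() throws IOException;
  }

  /**
   * Returns true if the probe ran, false if it failed with an ordinary
   * IOException. An interrupt is propagated to the caller instead of
   * being reported as "bash is not supported".
   */
  static boolean isSupported(Probe probe) throws InterruptedIOException {
    try {
      probe.run();
      return true;
    } catch (InterruptedIOException iioe) {
      // InterruptedIOException is a subclass of IOException, so it must be
      // caught first. It says nothing about bash: restore the thread's
      // interrupt flag and let the caller see the interrupt.
      Thread.currentThread().interrupt();
      throw iioe;
    } catch (IOException ioe) {
      // Any other I/O failure is treated as "not supported".
      return false;
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println(isSupported(() -> { }));
    System.out.println(isSupported(() -> { throw new IOException("no bash"); }));
    try {
      isSupported(() -> { throw new InterruptedIOException("interrupted"); });
    } catch (InterruptedIOException e) {
      // Thread.interrupted() reads and clears the restored flag.
      System.out.println("interrupt propagated, flag=" + Thread.interrupted());
    }
  }
}
```

Restoring the interrupt flag before rethrowing is a common Java idiom; whether the real code should rethrow, restore the flag, or only log a different message is a design choice this sketch does not settle.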