Date: Tue, 4 Jun 2019 06:51:00 +0000 (UTC)
From: "Peter Bacsko (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Comment Edited] (YARN-9595) FPGA plugin: NullPointerException in FpgaNodeResourceUpdateHandler.updateConfiguredResource()

    [ https://issues.apache.org/jira/browse/YARN-9595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16855396#comment-16855396 ]

Peter Bacsko edited comment on YARN-9595 at 6/4/19 6:50 AM:
------------------------------------------------------------

Thanks for committing this quickly [~tangzhankun]


was (Author: pbacsko):
Thanks for commit this quickly [~tangzhankun]

> FPGA plugin: NullPointerException in FpgaNodeResourceUpdateHandler.updateConfiguredResource()
> ---------------------------------------------------------------------------------------------
>
>                 Key: YARN-9595
>                 URL: https://issues.apache.org/jira/browse/YARN-9595
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>            Reporter: Peter Bacsko
>            Assignee: Peter Bacsko
>            Priority: Major
>         Attachments: YARN-9595-001.patch
>
>
> YARN-9264 accidentally introduced a bug in FpgaDiscoverer. Sometimes {{currentFpgaInfo}} is not set, resulting in an NPE being thrown:
> {noformat}
> 2019-06-03 05:14:50,157 INFO org.apache.hadoop.service.AbstractService: Service NodeManager failed in state INITED; cause: java.lang.NullPointerException
> java.lang.NullPointerException
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.FpgaNodeResourceUpdateHandler.updateConfiguredResource(FpgaNodeResourceUpdateHandler.java:54)
>         at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.updateConfiguredResourcesViaPlugins(NodeStatusUpdaterImpl.java:358)
>         at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceInit(NodeStatusUpdaterImpl.java:190)
>         at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>         at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
>         at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:459)
>         at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>         at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:869)
>         at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:942)
> {noformat}
> The problem is that in {{FpgaDiscoverer}}, we don't set {{currentFpgaInfo}} if the following condition is true:
> {noformat}
>     if (allowed == null || allowed.equalsIgnoreCase(
>         YarnConfiguration.AUTOMATICALLY_DISCOVER_GPU_DEVICES)) {
>       return list;
>     } else if (allowed.matches("(\\d,)*\\d")){
>       ...
> {noformat}
> The solution is simple: initialize it in both code paths.
> Unit tests should be enhanced to verify that it's set properly.
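
The fix proposed in the issue (set {{currentFpgaInfo}} on both code paths) might look roughly like the sketch below. It is a simplified, hypothetical stand-in and not the contents of YARN-9595-001.patch: the class name {{FpgaDiscovererSketch}}, the {{AUTO}} constant, the {{String}} device type, and the filtering done in the second branch are all assumptions made for illustration. Only the shape of the condition is taken from the snippet quoted in the issue.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified stand-in for FpgaDiscoverer; NOT the Hadoop class.
class FpgaDiscovererSketch {
  // Assumed placeholder for YarnConfiguration.AUTOMATICALLY_DISCOVER_GPU_DEVICES.
  static final String AUTO = "auto";

  // Stays null until discover() has run; reading it earlier is what leads to the NPE.
  private List<String> currentFpgaInfo;

  List<String> getCurrentFpgaInfo() {
    return currentFpgaInfo;
  }

  // allowed:  null, "auto", or comma-separated single-digit indices such as "0,2"
  // detected: devices reported by a (hypothetical) vendor plugin
  List<String> discover(String allowed, List<String> detected) {
    List<String> list = new ArrayList<>(detected);

    if (allowed == null || allowed.equalsIgnoreCase(AUTO)) {
      // The early-return path that previously skipped the assignment.
      currentFpgaInfo = snapshot(list);
      return list;
    } else if (allowed.matches("(\\d,)*\\d")) {
      // Assumed behavior: keep only the devices whose index is listed.
      List<String> filtered = new ArrayList<>();
      for (String index : allowed.split(",")) {
        int i = Integer.parseInt(index);
        if (i < list.size()) {
          filtered.add(list.get(i));
        }
      }
      list = filtered;
    } else {
      throw new IllegalArgumentException("Invalid allowed-devices value: " + allowed);
    }

    // The second path records its result as well.
    currentFpgaInfo = snapshot(list);
    return list;
  }

  // Defensive copy so later changes to the returned list do not alter the recorded state.
  private static List<String> snapshot(List<String> devices) {
    return Collections.unmodifiableList(new ArrayList<>(devices));
  }
}
```

A unit test along the lines the issue asks for would call {{discover(null, devices)}} and {{discover("0,2", devices)}} on fresh instances and check that {{getCurrentFpgaInfo()}} is non-null after each call.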