Date: Tue, 1 Sep 2015 09:18:46 +0000 (UTC)
From: "Robert Metzger (JIRA)"
To: dev@flink.apache.org
Reply-To: dev@flink.apache.org
Subject: [jira] [Created] (FLINK-2601) IOManagerAsync may produce NPE during shutdown

Robert Metzger created FLINK-2601:
-------------------------------------

             Summary: IOManagerAsync may produce NPE during shutdown
                 Key: FLINK-2601
                 URL: https://issues.apache.org/jira/browse/FLINK-2601
             Project: Flink
          Issue Type: Bug
          Components: Tests
    Affects Versions: 0.10
            Reporter: Robert Metzger
            Priority: Minor

While analyzing a failed YARN test, I detected that it failed because it found the following exception in the logs:

taskmanager-stderr:
{code}
Exception in thread "I/O manager shutdown hook" java.lang.NullPointerException
	at org.apache.flink.runtime.io.disk.iomanager.IOManagerAsync.shutdown(IOManagerAsync.java:144)
	at org.apache.flink.runtime.io.disk.iomanager.IOManager$1.run(IOManager.java:103)
{code}

taskmanager.log:
{code}
18:45:00,812 INFO org.apache.flink.runtime.taskmanager.TaskManager - Starting TaskManager actor
18:45:00,819 INFO org.apache.flink.runtime.io.network.netty.NettyConfig - NettyConfig [server address: testing-worker-linux-docker-56ee9bbf-3203-linux-2.prod.travis-ci.org/172.17.9.129, server port: 38689, memory segment size (bytes): 32768, transport type: NIO, number of server threads: 0 (use Netty's default), number of client threads: 0 (use Netty's default), server connect backlog: 0 (use Netty's default), client connect timeout (sec): 120, send/receive buffer size (bytes): 0 (use Netty's default)]
18:45:00,822 INFO org.apache.flink.runtime.taskmanager.TaskManager - Messages between TaskManager and JobManager have a max timeout of 100000 milliseconds
18:45:00,825 INFO org.apache.flink.runtime.taskmanager.TaskManager - Temporary file directory '/home/travis/build/rmetzger/flink/flink-yarn-tests/target/flink-yarn-tests-fifo/flink-yarn-tests-fifo-localDir-nm-1_0/usercache/travis/appcache/application_1441046584836_0007': total 15 GB, usable 7 GB (46.67% usable)
18:45:00,929 INFO org.apache.flink.runtime.io.network.buffer.NetworkBufferPool - Allocated 64 MB for network buffer pool (number of memory segments: 2048, bytes per segment: 32768).
18:45:01,186 INFO org.apache.flink.runtime.taskmanager.TaskManager - Using 0.7 of the currently free heap space for Flink managed memory (236 MB).
18:45:01,755 INFO org.apache.flink.runtime.io.disk.iomanager.IOManager - I/O manager uses directory /home/travis/build/rmetzger/flink/flink-yarn-tests/target/flink-yarn-tests-fifo/flink-yarn-tests-fifo-localDir-nm-1_0/usercache/travis/appcache/application_1441046584836_0007/flink-io-1befed3c-89c5-4b5e-9043-1b92c4c047d4 for spill files.
18:45:01,831 ERROR org.apache.flink.yarn.appMaster.YarnTaskManagerRunner - RECEIVED SIGNAL 15: SIGTERM
18:45:01,833 ERROR org.apache.flink.runtime.io.disk.iomanager.IOManagerAsync - Error while shutting down IO Manager reader thread.
java.lang.NullPointerException
	at org.apache.flink.runtime.io.disk.iomanager.IOManagerAsync.shutdown(IOManagerAsync.java:133)
	at org.apache.flink.runtime.io.disk.iomanager.IOManager$1.run(IOManager.java:103)
18:45:01,841 INFO org.apache.flink.runtime.io.disk.iomanager.IOManager - I/O manager removed spill file directory /home/travis/build/rmetzger/flink/flink-yarn-tests/target/flink-yarn-tests-fifo/flink-yarn-tests-fifo-localDir-nm-1_0/usercache/travis/appcache/application_1441046584836_0007/flink-io-1befed3c-89c5-4b5e-9043-1b92c4c047d4
{code}

Looks like the TM is shutting down while still starting up. Hardening this should be easy.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
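
One possible way to harden a shutdown path against this kind of race is sketched below. This is not Flink's actual IOManagerAsync code: the class GuardedShutdownSketch, its workers field, and the start/shutdown methods are hypothetical names chosen for illustration. The sketch assumes the NPE comes from shutdown() touching worker threads that were never created because the shutdown hook ran before startup finished, which the report suggests but does not confirm.

```java
/**
 * Hypothetical stand-in for an I/O manager with worker threads.
 * Illustrates a shutdown() that tolerates being called before start().
 */
final class GuardedShutdownSketch {

    // Stays null until start() has created and launched every worker.
    private Thread[] workers;
    private boolean shutDown;

    /** Starts the worker threads; refuses to start after shutdown. */
    public synchronized void start(int numWorkers) {
        if (shutDown) {
            throw new IllegalStateException("already shut down");
        }
        Thread[] started = new Thread[numWorkers];
        for (int i = 0; i < numWorkers; i++) {
            Thread t = new Thread(() -> {
                try {
                    // Placeholder for real work: block until interrupted.
                    Thread.sleep(Long.MAX_VALUE);
                } catch (InterruptedException e) {
                    // Interrupted by shutdown(): exit the thread.
                }
            }, "sketch-worker-" + i);
            t.setDaemon(true);
            t.start();
            started[i] = t;
        }
        // Assign the field only after the array is fully populated.
        workers = started;
    }

    /**
     * Stops the workers. Safe to call before start() and more than once.
     * Returns true only for the call that actually performed the shutdown.
     */
    public synchronized boolean shutdown() {
        if (shutDown) {
            return false;
        }
        shutDown = true;
        if (workers == null) {
            // Shutdown arrived before startup completed: nothing to stop.
            return true;
        }
        for (Thread t : workers) {
            t.interrupt();
        }
        for (Thread t : workers) {
            try {
                t.join(1000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Shutdown before startup: must not throw.
        GuardedShutdownSketch early = new GuardedShutdownSketch();
        early.shutdown();

        // Normal lifecycle.
        GuardedShutdownSketch normal = new GuardedShutdownSketch();
        normal.start(2);
        normal.shutdown();
        System.out.println("both shutdowns completed");
    }
}
```

Synchronizing both methods keeps the sketch simple. A real fix might instead null-check each field individually or register the shutdown hook only after construction has finished.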