flink-issues mailing list archives

From "jing lining (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (FLINK-5844) jobmanager was killed when disk less 10% and restart fail
Date Mon, 20 Feb 2017 04:27:44 GMT

     [ https://issues.apache.org/jira/browse/FLINK-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

jing lining updated FLINK-5844:
-------------------------------
    Environment: 


    Description: 
JobManager was killed

Submit command: /bin/flink run -m yarn-cluster -yn 6 -yjm 1024 -ytm 2048

The JobManager log is:
{quote}
2017-02-19 03:20:37,087 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.
2017-02-19 03:20:37,088 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Stopping checkpoint coordinator for job 1b45608e30808183913eeffbb4d855da
2017-02-19 03:20:37,088 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Stopping checkpoint coordinator for job 1b45608e30808183913eeffbb4d855da
2017-02-19 03:20:37,089 INFO  org.apache.flink.runtime.blob.BlobCache                       - Shutting down BlobCache
2017-02-19 03:20:37,089 INFO  org.apache.flink.runtime.webmonitor.WebRuntimeMonitor         - Removing web dashboard root cache directory /tmp/flink-web-dfa2b369-44ea-4e35-8011-672a1e627a10
2017-02-19 03:20:37,089 INFO  org.apache.flink.runtime.blob.BlobCache                       - Shutting down BlobCache
2017-02-19 03:20:37,137 INFO  org.apache.flink.runtime.webmonitor.WebRuntimeMonitor         - Removing web dashboard jar upload directory /tmp/flink-web-upload-d6edb5ea-5894-489b-89f7-f2972fc9433d
2017-02-19 03:20:37,138 INFO  org.apache.flink.runtime.blob.BlobServer                      - Stopped BLOB server at 0.0.0.0:54513
End of LogType:jobmanager.log
{quote}
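
To locate the shutdown point in a longer jobmanager.log, a small script can help. The following is a minimal sketch, not part of Flink: the regular expression assumes the log4j layout seen in the excerpt above, and the name `find_signal_shutdowns` is made up for illustration.

```python
import re

# Assumed layout, taken from the excerpt above:
# "<date> <time>,<millis> <LEVEL>  <logger>   - <message>"
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"(?P<logger>\S+)\s+-\s(?P<msg>.*)$"
)


def find_signal_shutdowns(lines):
    """Return (timestamp, message) for every 'RECEIVED SIGNAL' log entry."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\n"))
        if match and "RECEIVED SIGNAL" in match.group("msg"):
            hits.append((match.group("ts"), match.group("msg")))
    return hits


if __name__ == "__main__":
    sample = [
        "2017-02-19 03:20:37,087 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner"
        "             - RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.",
        "2017-02-19 03:20:37,089 INFO  org.apache.flink.runtime.blob.BlobCache"
        "                       - Shutting down BlobCache",
    ]
    for ts, msg in find_signal_shutdowns(sample):
        print(ts, msg)
```

A SIGTERM logged this way means the process was asked to stop from outside (here presumably by YARN); the log line alone does not say why.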

YARN then restarts the JobManager on a new node, but the restart always fails.

The log of the restarted JobManager is:

{quote}
2017-02-19 03:20:39,166 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - --------------------------------------------------------------------------------
2017-02-19 03:20:39,167 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             -  Starting YARN ApplicationMaster / JobManager (Version: 1.1.3, Rev:8e8d454, Date:10.10.2016 @ 13:26:32 UTC)
2017-02-19 03:20:39,167 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             -  Current user: appweb
2017-02-19 03:20:39,167 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             -  JVM: Java HotSpot(TM) 64-Bit Server VM - Oracle Corporation - 1.8/25.65-b01
2017-02-19 03:20:39,167 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             -  Maximum heap size: 1840 MiBytes
2017-02-19 03:20:39,168 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             -  JAVA_HOME: /data/program/java
2017-02-19 03:20:39,168 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             -  Hadoop version: 2.7.2
2017-02-19 03:20:39,169 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             -  JVM Options:
2017-02-19 03:20:39,169 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             -     -Xmx1920M
2017-02-19 03:20:39,169 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             -     -verbose:gc -Xloggc:/tmp/gc.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=20 -XX:GCLogFileSize=500m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/java.hprof
2017-02-19 03:20:39,169 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             -     -Dlog.file=/data/logs/hadoop/containers/application_1482390799413_0053/container_1482390799413_0053_02_000001/jobmanager.log
2017-02-19 03:20:39,169 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             -     -Dlogback.configurationFile=file:logback.xml
2017-02-19 03:20:39,169 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             -     -Dlog4j.configuration=file:log4j.properties
2017-02-19 03:20:39,169 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             -  Program Arguments: (none)
2017-02-19 03:20:39,169 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             -  Classpath: logback.xml:log4j.properties:lib/slf4j-log4j12-1.7.7.jar:lib/flink-python_2.11-1.1.3.jar:lib/log4j-1.2.17.jar:lib/flink-dist_2.11-1.1.3.jar:flink.jar:flink-conf.yaml::/data/program/hadoop-2.7.2/etc/hadoop:/data/program/hadoop-2.7.2/share/hadoop/common/hadoop-common-2.7.2.jar:/data/program/hadoop-2.7.2/share/hadoop/common/hadoop-common-2.7.2-tests.jar:/data/program/hadoop-2.7.2/share/hadoop/common/hadoop-nfs-2.7.2.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/gson-2.2.4.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/api-util-1.0.0-M20.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/jackson-core-asl-1.9.13.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/jaxb-impl-2.2.3-1.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/commons-digester-1.8.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/hamcrest-core-1.3.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/netty-3.6.2.Final.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/commons-cli-1.2.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/jsch-0.1.42.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/commons-lang-2.6.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/curator-client-2.7.1.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/jsp-api-2.1.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/hadoop-annotations-2.7.2.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/avro-1.7.4.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/curator-framework-2.7.1.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/commons-beanutils-1.7.0.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/htrace-core-3.1.0-incubating.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/commons-math3-3.1.1.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/jsr305-3.0.0.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/apacheds-kerberos-codec-
2.0.0-M15.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/xmlenc-0.52.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/slf4j-api-1.7.10.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/jersey-core-1.9.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/guava-11.0.2.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/commons-codec-1.4.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/commons-logging-1.1.3.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/api-asn1-api-1.0.0-M20.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/stax-api-1.0-2.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/commons-httpclient-3.1.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/zookeeper-3.4.6.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/java-xmlbuilder-0.4.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/xz-1.0.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/httpcore-4.2.5.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/jetty-6.1.26.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/protobuf-java-2.5.0.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/httpclient-4.2.5.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/commons-configuration-1.6.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/jackson-mapper-asl-1.9.13.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/jersey-server-1.9.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/paranamer-2.3.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/log4j-1.2.17.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/commons-compress-1.4.1.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/hadoop-auth-2.7.2.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/jaxb-api-2.2.2.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/commons-beanutils-core-1.8.0.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/jetty-util-6.1.26.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/apacheds-i18n-2.0.0-M15.jar:/data/prog
ram/hadoop-2.7.2/share/hadoop/common/lib/junit-4.11.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/servlet-api-2.5.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/jersey-json-1.9.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/commons-net-3.1.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/jackson-xc-1.9.13.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/activation-1.1.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/commons-collections-3.2.2.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/jettison-1.1.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/commons-io-2.4.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/mockito-all-1.8.5.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/asm-3.2.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/snappy-java-1.0.4.1.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/jackson-jaxrs-1.9.13.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/jets3t-0.9.0.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/curator-recipes-2.7.1.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/hadoop-hdfs-2.7.2-tests.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/hadoop-hdfs-2.7.2.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/hadoop-hdfs-nfs-2.7.2.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/jackson-core-asl-1.9.13.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/netty-3.6.2.Final.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/commons-cli-1.2.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/commons-lang-2.6.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/htrace-core-3.1.0-incubating.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/jsr305-3.0.0.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/xmlenc-0.52.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/jersey-core-1.9.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/guava-11.0.2.jar:/data/program/h
adoop-2.7.2/share/hadoop/hdfs/lib/commons-codec-1.4.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/commons-logging-1.1.3.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/xercesImpl-2.9.1.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/leveldbjni-all-1.8.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/jetty-6.1.26.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/protobuf-java-2.5.0.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/jackson-mapper-asl-1.9.13.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/jersey-server-1.9.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/log4j-1.2.17.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/xml-apis-1.3.04.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/jetty-util-6.1.26.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/netty-all-4.0.23.Final.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/servlet-api-2.5.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/commons-io-2.4.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/asm-3.2.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/commons-daemon-1.0.13.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/hadoop-yarn-server-tests-2.7.2.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/hadoop-yarn-applications-unmanaged-am-launcher-2.7.2.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/hadoop-yarn-api-2.7.2.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/hadoop-yarn-server-applicationhistoryservice-2.7.2.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/hadoop-yarn-server-web-proxy-2.7.2.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/hadoop-yarn-server-common-2.7.2.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/hadoop-yarn-common-2.7.2.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/hadoop-yarn-server-sharedcachemanager-2.7.2.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/hadoop-yarn-server-nodemanager-2.7.2.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/hadoop-yarn-client-2.7.2.jar:/data/program/hadoop-2.7.2/share/hadoo
p/yarn/hadoop-yarn-server-resourcemanager-2.7.2.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/hadoop-yarn-registry-2.7.2.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.2.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/jackson-core-asl-1.9.13.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/hbase-protocol-1.1.3.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/jaxb-impl-2.2.3-1.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/guice-servlet-3.0.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/netty-3.6.2.Final.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/commons-cli-1.2.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/commons-lang-2.6.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/javax.inject-1.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/jsr305-3.0.0.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/aopalliance-1.0.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/jersey-core-1.9.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/guava-11.0.2.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/commons-codec-1.4.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/commons-logging-1.1.3.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/jersey-guice-1.9.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/stax-api-1.0-2.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/jersey-client-1.9.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/zookeeper-3.4.6.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/leveldbjni-all-1.8.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/xz-1.0.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/jetty-6.1.26.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/protobuf-java-2.5.0.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/jackson-mapper-asl-1.9.13.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/guice-3.0.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/jersey-server-1.9.jar:/data/program/hadoop-2.7.2
/share/hadoop/yarn/lib/log4j-1.2.17.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/commons-compress-1.4.1.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/jaxb-api-2.2.2.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/jetty-util-6.1.26.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/zookeeper-3.4.6-tests.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/servlet-api-2.5.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/jersey-json-1.9.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/jackson-xc-1.9.13.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/activation-1.1.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/commons-collections-3.2.2.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/jettison-1.1.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/commons-io-2.4.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/asm-3.2.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/jackson-jaxrs-1.9.13.jar
2017-02-19 03:20:39,170 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - --------------------------------------------------------------------------------
2017-02-19 03:20:39,170 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - Registered UNIX signal handlers for [TERM, HUP, INT]
2017-02-19 03:20:39,171 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - YARN daemon runs as user appweb. Running Flink Application Master/JobManager as user appweb
2017-02-19 03:20:39,173 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - YARN assigned hostname for application master: dcp-168-7-hzqsh.node.hzqsh.wacai.sdc
2017-02-19 03:20:39,177 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - Loading config from directory /data/hadoop/yarn-local/usercache/appweb/appcache/application_1482390799413_0053/container_1482390799413_0053_02_000001
2017-02-19 03:20:39,189 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - TaskManagers will be created with 1 task slots
2017-02-19 03:20:39,189 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - TaskManagers will be started with container size 4096 MB, JVM heap size 3072 MB, JVM direct memory limit 3072 MB
2017-02-19 03:20:39,203 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - Trying to start actor system at 10.1.168.7:56782
2017-02-19 03:20:39,573 INFO  akka.event.slf4j.Slf4jLogger                                  - Slf4jLogger started
2017-02-19 03:20:39,607 INFO  Remoting                                                      - Starting remoting
2017-02-19 03:20:39,719 INFO  Remoting                                                      - Remoting started; listening on addresses :[akka.tcp://flink@10.1.168.7:56782]
2017-02-19 03:20:39,724 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - Actor system started at 10.1.168.7:56782
2017-02-19 03:20:39,724 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - Actor system bound to hostname 10.1.168.7.
2017-02-19 03:20:39,727 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - Setting up resources for TaskManagers
2017-02-19 03:20:40,294 INFO  org.apache.flink.yarn.Utils                                   - Copying from file:/data/hadoop/yarn-local/usercache/appweb/appcache/application_1482390799413_0053/container_1482390799413_0053_02_000001/01fac057-b7aa-4f57-9d78-66d2a792d390-taskmanager-conf.yaml to hdfs://10.1.168.10:9000/user/appweb/.flink/application_1482390799413_0053/01fac057-b7aa-4f57-9d78-66d2a792d390-taskmanager-conf.yaml
2017-02-19 03:20:41,466 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - Prepared local resource for modified yaml: resource { scheme: "hdfs" host: "10.1.168.10" port: 9000 file: "/user/appweb/.flink/application_1482390799413_0053/01fac057-b7aa-4f57-9d78-66d2a792d390-taskmanager-conf.yaml" } size: 1003 timestamp: 1487445641035 type: FILE visibility: APPLICATION
2017-02-19 03:20:41,470 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - Creating container launch context for TaskManagers
2017-02-19 03:20:41,470 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - Starting TaskManagers with command: $JAVA_HOME/bin/java -Xms3072m -Xmx3072m -XX:MaxDirectMemorySize=3072m '-verbose:gc -Xloggc:/tmp/gc.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=20 -XX:GCLogFileSize=500m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/java.hprof' -Dlog.file=<LOG_DIR>/taskmanager.log -Dlogback.configurationFile=file:./logback.xml -Dlog4j.configuration=file:./log4j.properties org.apache.flink.yarn.YarnTaskManager --configDir . 1> <LOG_DIR>/taskmanager.out 2> <LOG_DIR>/taskmanager.err
2017-02-19 03:20:41,493 INFO  org.apache.flink.runtime.blob.BlobServer                      - Created BLOB server storage directory /tmp/blobStore-12a930b7-5967-4576-bc6a-80c3f3d46654
2017-02-19 03:20:41,494 INFO  org.apache.flink.runtime.blob.BlobServer                      - Started BLOB server at 0.0.0.0:31707 - max concurrent requests: 50 - max backlog: 1000
2017-02-19 03:20:41,503 INFO  org.apache.flink.runtime.checkpoint.savepoint.SavepointStoreFactory  - Using job manager savepoint state backend.
2017-02-19 03:20:41,507 INFO  org.apache.flink.runtime.metrics.MetricRegistry               - No metrics reporter configured, no metrics will be exposed/reported.
2017-02-19 03:20:41,510 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - Starting JobManager Web Frontend
2017-02-19 03:20:41,514 INFO  org.apache.flink.runtime.jobmanager.MemoryArchivist           - Started memory archivist akka://flink/user/$a
2017-02-19 03:20:41,515 INFO  org.apache.flink.yarn.YarnJobManager                          - Starting JobManager at akka.tcp://flink@10.1.168.7:56782/user/jobmanager.
2017-02-19 03:20:41,517 INFO  org.apache.flink.runtime.webmonitor.WebMonitorUtils           - Determined location of JobManager log file: /data/logs/hadoop/containers/application_1482390799413_0053/container_1482390799413_0053_02_000001/jobmanager.log
2017-02-19 03:20:41,517 INFO  org.apache.flink.runtime.webmonitor.WebMonitorUtils           - Determined location of JobManager stdout file: /data/logs/hadoop/containers/application_1482390799413_0053/container_1482390799413_0053_02_000001/jobmanager.out
2017-02-19 03:20:41,517 INFO  org.apache.flink.runtime.webmonitor.WebRuntimeMonitor         - Using directory /tmp/flink-web-257b6cc9-8161-470b-8c1e-bb5c971c22c6 for the web interface files
2017-02-19 03:20:41,517 INFO  org.apache.flink.runtime.webmonitor.WebRuntimeMonitor         - Using directory /tmp/flink-web-upload-ddd4aa8c-6519-4f48-8af9-33d5ca6d2abd for web frontend JAR file uploads
2017-02-19 03:20:41,582 INFO  org.apache.flink.yarn.YarnJobManager                          - JobManager akka.tcp://flink@10.1.168.7:56782/user/jobmanager was granted leadership with leader session ID None.
2017-02-19 03:20:41,764 INFO  org.apache.flink.runtime.webmonitor.WebRuntimeMonitor         - Web frontend listening at 0.0.0.0:23430
2017-02-19 03:20:41,764 INFO  org.apache.flink.runtime.webmonitor.WebRuntimeMonitor         - Starting with JobManager akka.tcp://flink@10.1.168.7:56782/user/jobmanager on port 23430
2017-02-19 03:20:41,764 INFO  org.apache.flink.runtime.webmonitor.JobManagerRetriever       - New leader reachable under akka://flink/user/jobmanager#1485650694:null.
2017-02-19 03:20:41,771 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - YARN application tolerates 6 failed TaskManager containers before giving up
2017-02-19 03:20:41,774 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - YARN Application Master started
2017-02-19 03:20:41,782 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Initializing YARN resource master
2017-02-19 03:20:41,791 INFO  org.apache.hadoop.yarn.client.RMProxy                         - Connecting to ResourceManager at /10.1.168.10:8030
2017-02-19 03:20:41,808 INFO  org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy  - yarn.client.max-cached-nodemanagers-proxies : 0
2017-02-19 03:20:41,809 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Registering Application Master with tracking url http://dcp-168-7-hzqsh.node.hzqsh.wacai.sdc:23430
2017-02-19 03:20:41,910 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Trying to associate with JobManager leader akka://flink/user/jobmanager#1485650694
2017-02-19 03:20:41,914 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Resource Manager associating with leading JobManager Actor[akka://flink/user/jobmanager#1485650694] - leader session null
2017-02-19 03:20:41,915 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Requesting new TaskManager container with 4096 megabytes memory. Pending requests: 1
2017-02-19 03:20:41,922 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Requesting new TaskManager container with 4096 megabytes memory. Pending requests: 2
2017-02-19 03:20:41,922 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Requesting new TaskManager container with 4096 megabytes memory. Pending requests: 3
2017-02-19 03:20:41,923 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Requesting new TaskManager container with 4096 megabytes memory. Pending requests: 4
2017-02-19 03:20:41,923 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Requesting new TaskManager container with 4096 megabytes memory. Pending requests: 5
2017-02-19 03:20:41,923 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Requesting new TaskManager container with 4096 megabytes memory. Pending requests: 6
2017-02-19 03:20:42,954 INFO  org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl         - Received new token for : dcp-168-11-hzqsh.node.hzqsh.wacai.sdc:44636
2017-02-19 03:20:42,954 INFO  org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl         - Received new token for : dcp-168-3-hzqsh.node.hzqsh.wacai.sdc:9910
2017-02-19 03:20:42,954 INFO  org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl         - Received new token for : dcp-168-9-hzqsh.node.hzqsh.wacai.sdc:57063
2017-02-19 03:20:42,954 INFO  org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl         - Received new token for : dcp-168-1-hzqsh.node.hzqsh.wacai.sdc:42444
2017-02-19 03:20:42,967 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Received new container: container_1482390799413_0053_02_000002 - Remaining pending container requests: 5
2017-02-19 03:20:42,968 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Launching TaskManager in container ContainerInLaunch @ 1487445642968: Container: [ContainerId: container_1482390799413_0053_02_000002, NodeId: dcp-168-11-hzqsh.node.hzqsh.wacai.sdc:44636, NodeHttpAddress: dcp-168-11-hzqsh.node.hzqsh.wacai.sdc:8042, Resource: <memory:5120, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 10.1.168.11:44636 }, ] on host dcp-168-11-hzqsh.node.hzqsh.wacai.sdc
2017-02-19 03:20:42,971 INFO  org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy  - Opening proxy : dcp-168-11-hzqsh.node.hzqsh.wacai.sdc:44636
2017-02-19 03:20:43,014 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Received new container: container_1482390799413_0053_02_000003 - Remaining pending container requests: 4
2017-02-19 03:20:43,014 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Launching TaskManager in container ContainerInLaunch @ 1487445643014: Container: [ContainerId: container_1482390799413_0053_02_000003, NodeId: dcp-168-3-hzqsh.node.hzqsh.wacai.sdc:9910, NodeHttpAddress: dcp-168-3-hzqsh.node.hzqsh.wacai.sdc:8042, Resource: <memory:5120, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 10.1.168.3:9910 }, ] on host dcp-168-3-hzqsh.node.hzqsh.wacai.sdc
2017-02-19 03:20:43,015 INFO  org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy  - Opening proxy : dcp-168-3-hzqsh.node.hzqsh.wacai.sdc:9910
2017-02-19 03:20:43,025 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Received new container: container_1482390799413_0053_02_000004 - Remaining pending container requests: 3
2017-02-19 03:20:43,025 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Launching TaskManager in container ContainerInLaunch @ 1487445643025: Container: [ContainerId: container_1482390799413_0053_02_000004, NodeId: dcp-168-9-hzqsh.node.hzqsh.wacai.sdc:57063, NodeHttpAddress: dcp-168-9-hzqsh.node.hzqsh.wacai.sdc:8042, Resource: <memory:5120, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 10.1.168.9:57063 }, ] on host dcp-168-9-hzqsh.node.hzqsh.wacai.sdc
2017-02-19 03:20:43,026 INFO  org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy  - Opening proxy : dcp-168-9-hzqsh.node.hzqsh.wacai.sdc:57063
2017-02-19 03:20:43,043 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Received new container: container_1482390799413_0053_02_000005 - Remaining pending container requests: 2
2017-02-19 03:20:43,043 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Launching TaskManager in container ContainerInLaunch @ 1487445643043: Container: [ContainerId: container_1482390799413_0053_02_000005, NodeId: dcp-168-1-hzqsh.node.hzqsh.wacai.sdc:42444, NodeHttpAddress: dcp-168-1-hzqsh.node.hzqsh.wacai.sdc:8042, Resource: <memory:5120, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 10.1.168.1:42444 }, ] on host dcp-168-1-hzqsh.node.hzqsh.wacai.sdc
2017-02-19 03:20:43,045 INFO  org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy  - Opening proxy : dcp-168-1-hzqsh.node.hzqsh.wacai.sdc:42444
2017-02-19 03:20:43,461 INFO  org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl         - Received new token for : dcp-168-6-hzqsh.node.hzqsh.wacai.sdc:43466
2017-02-19 03:20:43,461 INFO  org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl         - Received new token for : dcp-168-5-hzqsh.node.hzqsh.wacai.sdc:64371
2017-02-19 03:20:43,462 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Received new container: container_1482390799413_0053_02_000006 - Remaining pending container requests: 1
2017-02-19 03:20:43,463 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Launching TaskManager in container ContainerInLaunch @ 1487445643462: Container: [ContainerId: container_1482390799413_0053_02_000006, NodeId: dcp-168-6-hzqsh.node.hzqsh.wacai.sdc:43466, NodeHttpAddress: dcp-168-6-hzqsh.node.hzqsh.wacai.sdc:8042, Resource: <memory:5120, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 10.1.168.6:43466 }, ] on host dcp-168-6-hzqsh.node.hzqsh.wacai.sdc
2017-02-19 03:20:43,463 INFO  org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy  - Opening proxy : dcp-168-6-hzqsh.node.hzqsh.wacai.sdc:43466
2017-02-19 03:20:43,482 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Received new container: container_1482390799413_0053_02_000007 - Remaining pending container requests: 0
2017-02-19 03:20:43,482 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Launching TaskManager in container ContainerInLaunch @ 1487445643482: Container: [ContainerId: container_1482390799413_0053_02_000007, NodeId: dcp-168-5-hzqsh.node.hzqsh.wacai.sdc:64371, NodeHttpAddress: dcp-168-5-hzqsh.node.hzqsh.wacai.sdc:8042, Resource: <memory:5120, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 10.1.168.5:64371 }, ] on host dcp-168-5-hzqsh.node.hzqsh.wacai.sdc
2017-02-19 03:20:43,483 INFO  org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy  - Opening proxy : dcp-168-5-hzqsh.node.hzqsh.wacai.sdc:64371
2017-02-19 03:20:44,244 WARN  org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler     - Error while handling request
org.apache.flink.runtime.webmonitor.NotFoundException: Could not find job with id 1b45608e30808183913eeffbb4d855da
	at org.apache.flink.runtime.webmonitor.handlers.AbstractExecutionGraphRequestHandler.handleRequest(AbstractExecutionGraphRequestHandler.java:58)
	at org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler.respondAsLeader(RuntimeMonitorHandler.java:88)
	at org.apache.flink.runtime.webmonitor.RuntimeMonitorHandlerBase.channelRead0(RuntimeMonitorHandlerBase.java:84)
	at org.apache.flink.runtime.webmonitor.RuntimeMonitorHandlerBase.channelRead0(RuntimeMonitorHandlerBase.java:44)
	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
	at io.netty.handler.codec.http.router.Handler.routed(Handler.java:62)
	at io.netty.handler.codec.http.router.DualAbstractHandler.channelRead0(DualAbstractHandler.java:57)
	at io.netty.handler.codec.http.router.DualAbstractHandler.channelRead0(DualAbstractHandler.java:20)
	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
	at org.apache.flink.runtime.webmonitor.HttpRequestHandler.channelRead0(HttpRequestHandler.java:105)
	at org.apache.flink.runtime.webmonitor.HttpRequestHandler.channelRead0(HttpRequestHandler.java:65)
	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)
	at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:147)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
	at java.lang.Thread.run(Thread.java:745)
{quote}

  was:
JobManager was killed

The log is:
{quote}
2017-02-19 03:20:37,087 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.
2017-02-19 03:20:37,088 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Stopping checkpoint coordinator for job 1b45608e30808183913eeffbb4d855da
2017-02-19 03:20:37,088 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Stopping checkpoint coordinator for job 1b45608e30808183913eeffbb4d855da
2017-02-19 03:20:37,089 INFO  org.apache.flink.runtime.blob.BlobCache                       - Shutting down BlobCache
2017-02-19 03:20:37,089 INFO  org.apache.flink.runtime.webmonitor.WebRuntimeMonitor         - Removing web dashboard root cache directory /tmp/flink-web-dfa2b369-44ea-4e35-8011-672a1e627a10
2017-02-19 03:20:37,089 INFO  org.apache.flink.runtime.blob.BlobCache                       - Shutting down BlobCache
2017-02-19 03:20:37,137 INFO  org.apache.flink.runtime.webmonitor.WebRuntimeMonitor         - Removing web dashboard jar upload directory /tmp/flink-web-upload-d6edb5ea-5894-489b-89f7-f2972fc9433d
2017-02-19 03:20:37,138 INFO  org.apache.flink.runtime.blob.BlobServer                      - Stopped BLOB server at 0.0.0.0:54513
End of LogType:jobmanager.log
{quote}

YARN then restarted the JobManager on a new node, but the restart always fails.

The log is:

{quote}
2017-02-19 03:20:44,244 WARN  org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler     - Error while handling request
org.apache.flink.runtime.webmonitor.NotFoundException: Could not find job with id 1b45608e30808183913eeffbb4d855da
	at org.apache.flink.runtime.webmonitor.handlers.AbstractExecutionGraphRequestHandler.handleRequest(AbstractExecutionGraphRequestHandler.java:58)
	at org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler.respondAsLeader(RuntimeMonitorHandler.java:88)
	at org.apache.flink.runtime.webmonitor.RuntimeMonitorHandlerBase.channelRead0(RuntimeMonitorHandlerBase.java:84)
	at org.apache.flink.runtime.webmonitor.RuntimeMonitorHandlerBase.channelRead0(RuntimeMonitorHandlerBase.java:44)
	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
	at io.netty.handler.codec.http.router.Handler.routed(Handler.java:62)
	at io.netty.handler.codec.http.router.DualAbstractHandler.channelRead0(DualAbstractHandler.java:57)
	at io.netty.handler.codec.http.router.DualAbstractHandler.channelRead0(DualAbstractHandler.java:20)
	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
	at org.apache.flink.runtime.webmonitor.HttpRequestHandler.channelRead0(HttpRequestHandler.java:105)
	at org.apache.flink.runtime.webmonitor.HttpRequestHandler.channelRead0(HttpRequestHandler.java:65)
	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)
	at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:147)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
	at java.lang.Thread.run(Thread.java:745)
{quote}


> jobmanager was killed when disk less 10% and restart fail
> ---------------------------------------------------------
>
>                 Key: FLINK-5844
>                 URL: https://issues.apache.org/jira/browse/FLINK-5844
>             Project: Flink
>          Issue Type: Bug
>          Components: YARN
>    Affects Versions: 1.1.3
>         Environment: 
>            Reporter: jing lining
>
> JobManager was killed
> Submit command: /bin/flink run -m yarn-cluster -yn 6 -yjm 1024 -ytm 2048
> The log is:
> {quote}
> 2017-02-19 03:20:37,087 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.
> 2017-02-19 03:20:37,088 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Stopping checkpoint coordinator for job 1b45608e30808183913eeffbb4d855da
> 2017-02-19 03:20:37,088 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Stopping checkpoint coordinator for job 1b45608e30808183913eeffbb4d855da
> 2017-02-19 03:20:37,089 INFO  org.apache.flink.runtime.blob.BlobCache                       - Shutting down BlobCache
> 2017-02-19 03:20:37,089 INFO  org.apache.flink.runtime.webmonitor.WebRuntimeMonitor         - Removing web dashboard root cache directory /tmp/flink-web-dfa2b369-44ea-4e35-8011-672a1e627a10
> 2017-02-19 03:20:37,089 INFO  org.apache.flink.runtime.blob.BlobCache                       - Shutting down BlobCache
> 2017-02-19 03:20:37,137 INFO  org.apache.flink.runtime.webmonitor.WebRuntimeMonitor         - Removing web dashboard jar upload directory /tmp/flink-web-upload-d6edb5ea-5894-489b-89f7-f2972fc9433d
> 2017-02-19 03:20:37,138 INFO  org.apache.flink.runtime.blob.BlobServer                      - Stopped BLOB server at 0.0.0.0:54513
> End of LogType:jobmanager.log
> {quote}
> YARN then restarted the JobManager on a new node, but the restart always fails.
> The log is:
> {quote}
> 2017-02-19 03:20:39,166 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - --------------------------------------------------------------------------------
> 2017-02-19 03:20:39,167 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             -  Starting YARN ApplicationMaster / JobManager (Version: 1.1.3, Rev:8e8d454, Date:10.10.2016 @ 13:26:32 UTC)
> 2017-02-19 03:20:39,167 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             -  Current user: appweb
> 2017-02-19 03:20:39,167 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             -  JVM: Java HotSpot(TM) 64-Bit Server VM - Oracle Corporation - 1.8/25.65-b01
> 2017-02-19 03:20:39,167 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             -  Maximum heap size: 1840 MiBytes
> 2017-02-19 03:20:39,168 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             -  JAVA_HOME: /data/program/java
> 2017-02-19 03:20:39,168 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             -  Hadoop version: 2.7.2
> 2017-02-19 03:20:39,169 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             -  JVM Options:
> 2017-02-19 03:20:39,169 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             -     -Xmx1920M
> 2017-02-19 03:20:39,169 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             -     -verbose:gc -Xloggc:/tmp/gc.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=20 -XX:GCLogFileSize=500m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/java.hprof
> 2017-02-19 03:20:39,169 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             -     -Dlog.file=/data/logs/hadoop/containers/application_1482390799413_0053/container_1482390799413_0053_02_000001/jobmanager.log
> 2017-02-19 03:20:39,169 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             -     -Dlogback.configurationFile=file:logback.xml
> 2017-02-19 03:20:39,169 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             -     -Dlog4j.configuration=file:log4j.properties
> 2017-02-19 03:20:39,169 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             -  Program Arguments: (none)
> 2017-02-19 03:20:39,169 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             -  Classpath: logback.xml:log4j.properties:lib/slf4j-log4j12-1.7.7.jar:lib/flink-python_2.11-1.1.3.jar:lib/log4j-1.2.17.jar:lib/flink-dist_2.11-1.1.3.jar:flink.jar:flink-conf.yaml::/data/program/hadoop-2.7.2/etc/hadoop:/data/program/hadoop-2.7.2/share/hadoop/common/hadoop-common-2.7.2.jar:/data/program/hadoop-2.7.2/share/hadoop/common/hadoop-common-2.7.2-tests.jar:/data/program/hadoop-2.7.2/share/hadoop/common/hadoop-nfs-2.7.2.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/gson-2.2.4.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/api-util-1.0.0-M20.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/jackson-core-asl-1.9.13.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/jaxb-impl-2.2.3-1.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/commons-digester-1.8.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/hamcrest-core-1.3.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/netty-3.6.2.Final.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/commons-cli-1.2.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/jsch-0.1.42.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/commons-lang-2.6.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/curator-client-2.7.1.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/jsp-api-2.1.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/hadoop-annotations-2.7.2.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/avro-1.7.4.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/curator-framework-2.7.1.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/commons-beanutils-1.7.0.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/htrace-core-3.1.0-incubating.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/commons-math3-3.1.1.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/jsr305-3.0.0.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/apacheds-kerberos-code
c-2.0.0-M15.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/xmlenc-0.52.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/slf4j-api-1.7.10.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/jersey-core-1.9.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/guava-11.0.2.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/commons-codec-1.4.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/commons-logging-1.1.3.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/api-asn1-api-1.0.0-M20.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/stax-api-1.0-2.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/commons-httpclient-3.1.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/zookeeper-3.4.6.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/java-xmlbuilder-0.4.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/xz-1.0.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/httpcore-4.2.5.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/jetty-6.1.26.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/protobuf-java-2.5.0.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/httpclient-4.2.5.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/commons-configuration-1.6.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/jackson-mapper-asl-1.9.13.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/jersey-server-1.9.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/paranamer-2.3.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/log4j-1.2.17.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/commons-compress-1.4.1.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/hadoop-auth-2.7.2.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/jaxb-api-2.2.2.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/commons-beanutils-core-1.8.0.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/jetty-util-6.1.26.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/apacheds-i18n-2.0.0-M15.jar:/data/pr
ogram/hadoop-2.7.2/share/hadoop/common/lib/junit-4.11.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/servlet-api-2.5.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/jersey-json-1.9.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/commons-net-3.1.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/jackson-xc-1.9.13.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/activation-1.1.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/commons-collections-3.2.2.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/jettison-1.1.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/commons-io-2.4.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/mockito-all-1.8.5.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/asm-3.2.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/snappy-java-1.0.4.1.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/jackson-jaxrs-1.9.13.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/jets3t-0.9.0.jar:/data/program/hadoop-2.7.2/share/hadoop/common/lib/curator-recipes-2.7.1.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/hadoop-hdfs-2.7.2-tests.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/hadoop-hdfs-2.7.2.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/hadoop-hdfs-nfs-2.7.2.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/jackson-core-asl-1.9.13.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/netty-3.6.2.Final.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/commons-cli-1.2.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/commons-lang-2.6.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/htrace-core-3.1.0-incubating.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/jsr305-3.0.0.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/xmlenc-0.52.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/jersey-core-1.9.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/guava-11.0.2.jar:/data/program
/hadoop-2.7.2/share/hadoop/hdfs/lib/commons-codec-1.4.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/commons-logging-1.1.3.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/xercesImpl-2.9.1.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/leveldbjni-all-1.8.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/jetty-6.1.26.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/protobuf-java-2.5.0.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/jackson-mapper-asl-1.9.13.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/jersey-server-1.9.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/log4j-1.2.17.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/xml-apis-1.3.04.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/jetty-util-6.1.26.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/netty-all-4.0.23.Final.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/servlet-api-2.5.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/commons-io-2.4.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/asm-3.2.jar:/data/program/hadoop-2.7.2/share/hadoop/hdfs/lib/commons-daemon-1.0.13.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/hadoop-yarn-server-tests-2.7.2.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/hadoop-yarn-applications-unmanaged-am-launcher-2.7.2.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/hadoop-yarn-api-2.7.2.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/hadoop-yarn-server-applicationhistoryservice-2.7.2.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/hadoop-yarn-server-web-proxy-2.7.2.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/hadoop-yarn-server-common-2.7.2.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/hadoop-yarn-common-2.7.2.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/hadoop-yarn-server-sharedcachemanager-2.7.2.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/hadoop-yarn-server-nodemanager-2.7.2.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/hadoop-yarn-client-2.7.2.jar:/data/program/hadoop-2.7.2/share/had
oop/yarn/hadoop-yarn-server-resourcemanager-2.7.2.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/hadoop-yarn-registry-2.7.2.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.2.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/jackson-core-asl-1.9.13.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/hbase-protocol-1.1.3.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/jaxb-impl-2.2.3-1.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/guice-servlet-3.0.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/netty-3.6.2.Final.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/commons-cli-1.2.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/commons-lang-2.6.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/javax.inject-1.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/jsr305-3.0.0.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/aopalliance-1.0.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/jersey-core-1.9.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/guava-11.0.2.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/commons-codec-1.4.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/commons-logging-1.1.3.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/jersey-guice-1.9.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/stax-api-1.0-2.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/jersey-client-1.9.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/zookeeper-3.4.6.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/leveldbjni-all-1.8.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/xz-1.0.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/jetty-6.1.26.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/protobuf-java-2.5.0.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/jackson-mapper-asl-1.9.13.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/guice-3.0.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/jersey-server-1.9.jar:/data/program/hadoop-2.7
.2/share/hadoop/yarn/lib/log4j-1.2.17.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/commons-compress-1.4.1.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/jaxb-api-2.2.2.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/jetty-util-6.1.26.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/zookeeper-3.4.6-tests.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/servlet-api-2.5.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/jersey-json-1.9.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/jackson-xc-1.9.13.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/activation-1.1.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/commons-collections-3.2.2.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/jettison-1.1.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/commons-io-2.4.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/asm-3.2.jar:/data/program/hadoop-2.7.2/share/hadoop/yarn/lib/jackson-jaxrs-1.9.13.jar
> 2017-02-19 03:20:39,170 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - --------------------------------------------------------------------------------
> 2017-02-19 03:20:39,170 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - Registered UNIX signal handlers for [TERM, HUP, INT]
> 2017-02-19 03:20:39,171 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - YARN daemon runs as user appweb. Running Flink Application Master/JobManager as user appweb
> 2017-02-19 03:20:39,173 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - YARN assigned hostname for application master: dcp-168-7-hzqsh.node.hzqsh.wacai.sdc
> 2017-02-19 03:20:39,177 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - Loading config from directory /data/hadoop/yarn-local/usercache/appweb/appcache/application_1482390799413_0053/container_1482390799413_0053_02_000001
> 2017-02-19 03:20:39,189 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - TaskManagers will be created with 1 task slots
> 2017-02-19 03:20:39,189 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - TaskManagers will be started with container size 4096 MB, JVM heap size 3072 MB, JVM direct memory limit 3072 MB
> 2017-02-19 03:20:39,203 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - Trying to start actor system at 10.1.168.7:56782
> 2017-02-19 03:20:39,573 INFO  akka.event.slf4j.Slf4jLogger                                  - Slf4jLogger started
> 2017-02-19 03:20:39,607 INFO  Remoting                                                      - Starting remoting
> 2017-02-19 03:20:39,719 INFO  Remoting                                                      - Remoting started; listening on addresses :[akka.tcp://flink@10.1.168.7:56782]
> 2017-02-19 03:20:39,724 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - Actor system started at 10.1.168.7:56782
> 2017-02-19 03:20:39,724 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - Actor system bound to hostname 10.1.168.7.
> 2017-02-19 03:20:39,727 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - Setting up resources for TaskManagers
> 2017-02-19 03:20:40,294 INFO  org.apache.flink.yarn.Utils                                   - Copying from file:/data/hadoop/yarn-local/usercache/appweb/appcache/application_1482390799413_0053/container_1482390799413_0053_02_000001/01fac057-b7aa-4f57-9d78-66d2a792d390-taskmanager-conf.yaml to hdfs://10.1.168.10:9000/user/appweb/.flink/application_1482390799413_0053/01fac057-b7aa-4f57-9d78-66d2a792d390-taskmanager-conf.yaml
> 2017-02-19 03:20:41,466 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - Prepared local resource for modified yaml: resource { scheme: "hdfs" host: "10.1.168.10" port: 9000 file: "/user/appweb/.flink/application_1482390799413_0053/01fac057-b7aa-4f57-9d78-66d2a792d390-taskmanager-conf.yaml" } size: 1003 timestamp: 1487445641035 type: FILE visibility: APPLICATION
> 2017-02-19 03:20:41,470 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - Creating container launch context for TaskManagers
> 2017-02-19 03:20:41,470 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - Starting TaskManagers with command: $JAVA_HOME/bin/java -Xms3072m -Xmx3072m -XX:MaxDirectMemorySize=3072m '-verbose:gc -Xloggc:/tmp/gc.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=20 -XX:GCLogFileSize=500m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/java.hprof' -Dlog.file=<LOG_DIR>/taskmanager.log -Dlogback.configurationFile=file:./logback.xml -Dlog4j.configuration=file:./log4j.properties org.apache.flink.yarn.YarnTaskManager --configDir . 1> <LOG_DIR>/taskmanager.out 2> <LOG_DIR>/taskmanager.err
> 2017-02-19 03:20:41,493 INFO  org.apache.flink.runtime.blob.BlobServer                      - Created BLOB server storage directory /tmp/blobStore-12a930b7-5967-4576-bc6a-80c3f3d46654
> 2017-02-19 03:20:41,494 INFO  org.apache.flink.runtime.blob.BlobServer                      - Started BLOB server at 0.0.0.0:31707 - max concurrent requests: 50 - max backlog: 1000
> 2017-02-19 03:20:41,503 INFO  org.apache.flink.runtime.checkpoint.savepoint.SavepointStoreFactory  - Using job manager savepoint state backend.
> 2017-02-19 03:20:41,507 INFO  org.apache.flink.runtime.metrics.MetricRegistry               - No metrics reporter configured, no metrics will be exposed/reported.
> 2017-02-19 03:20:41,510 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - Starting JobManager Web Frontend
> 2017-02-19 03:20:41,514 INFO  org.apache.flink.runtime.jobmanager.MemoryArchivist           - Started memory archivist akka://flink/user/$a
> 2017-02-19 03:20:41,515 INFO  org.apache.flink.yarn.YarnJobManager                          - Starting JobManager at akka.tcp://flink@10.1.168.7:56782/user/jobmanager.
> 2017-02-19 03:20:41,517 INFO  org.apache.flink.runtime.webmonitor.WebMonitorUtils           - Determined location of JobManager log file: /data/logs/hadoop/containers/application_1482390799413_0053/container_1482390799413_0053_02_000001/jobmanager.log
> 2017-02-19 03:20:41,517 INFO  org.apache.flink.runtime.webmonitor.WebMonitorUtils           - Determined location of JobManager stdout file: /data/logs/hadoop/containers/application_1482390799413_0053/container_1482390799413_0053_02_000001/jobmanager.out
> 2017-02-19 03:20:41,517 INFO  org.apache.flink.runtime.webmonitor.WebRuntimeMonitor         - Using directory /tmp/flink-web-257b6cc9-8161-470b-8c1e-bb5c971c22c6 for the web interface files
> 2017-02-19 03:20:41,517 INFO  org.apache.flink.runtime.webmonitor.WebRuntimeMonitor         - Using directory /tmp/flink-web-upload-ddd4aa8c-6519-4f48-8af9-33d5ca6d2abd for web frontend JAR file uploads
> 2017-02-19 03:20:41,582 INFO  org.apache.flink.yarn.YarnJobManager                          - JobManager akka.tcp://flink@10.1.168.7:56782/user/jobmanager was granted leadership with leader session ID None.
> 2017-02-19 03:20:41,764 INFO  org.apache.flink.runtime.webmonitor.WebRuntimeMonitor         - Web frontend listening at 0.0.0.0:23430
> 2017-02-19 03:20:41,764 INFO  org.apache.flink.runtime.webmonitor.WebRuntimeMonitor         - Starting with JobManager akka.tcp://flink@10.1.168.7:56782/user/jobmanager on port 23430
> 2017-02-19 03:20:41,764 INFO  org.apache.flink.runtime.webmonitor.JobManagerRetriever       - New leader reachable under akka://flink/user/jobmanager#1485650694:null.
> 2017-02-19 03:20:41,771 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - YARN application tolerates 6 failed TaskManager containers before giving up
> 2017-02-19 03:20:41,774 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - YARN Application Master started
> 2017-02-19 03:20:41,782 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Initializing YARN resource master
> 2017-02-19 03:20:41,791 INFO  org.apache.hadoop.yarn.client.RMProxy                         - Connecting to ResourceManager at /10.1.168.10:8030
> 2017-02-19 03:20:41,808 INFO  org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy  - yarn.client.max-cached-nodemanagers-proxies : 0
> 2017-02-19 03:20:41,809 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Registering Application Master with tracking url http://dcp-168-7-hzqsh.node.hzqsh.wacai.sdc:23430
> 2017-02-19 03:20:41,910 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Trying to associate with JobManager leader akka://flink/user/jobmanager#1485650694
> 2017-02-19 03:20:41,914 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Resource Manager associating with leading JobManager Actor[akka://flink/user/jobmanager#1485650694] - leader session null
> 2017-02-19 03:20:41,915 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Requesting new TaskManager container with 4096 megabytes memory. Pending requests: 1
> 2017-02-19 03:20:41,922 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Requesting new TaskManager container with 4096 megabytes memory. Pending requests: 2
> 2017-02-19 03:20:41,922 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Requesting new TaskManager container with 4096 megabytes memory. Pending requests: 3
> 2017-02-19 03:20:41,923 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Requesting new TaskManager container with 4096 megabytes memory. Pending requests: 4
> 2017-02-19 03:20:41,923 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Requesting new TaskManager container with 4096 megabytes memory. Pending requests: 5
> 2017-02-19 03:20:41,923 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Requesting new TaskManager container with 4096 megabytes memory. Pending requests: 6
> 2017-02-19 03:20:42,954 INFO  org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl         - Received new token for : dcp-168-11-hzqsh.node.hzqsh.wacai.sdc:44636
> 2017-02-19 03:20:42,954 INFO  org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl         - Received new token for : dcp-168-3-hzqsh.node.hzqsh.wacai.sdc:9910
> 2017-02-19 03:20:42,954 INFO  org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl         - Received new token for : dcp-168-9-hzqsh.node.hzqsh.wacai.sdc:57063
> 2017-02-19 03:20:42,954 INFO  org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl         - Received new token for : dcp-168-1-hzqsh.node.hzqsh.wacai.sdc:42444
> 2017-02-19 03:20:42,967 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Received new container: container_1482390799413_0053_02_000002 - Remaining pending container requests: 5
> 2017-02-19 03:20:42,968 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Launching TaskManager in container ContainerInLaunch @ 1487445642968: Container: [ContainerId: container_1482390799413_0053_02_000002, NodeId: dcp-168-11-hzqsh.node.hzqsh.wacai.sdc:44636, NodeHttpAddress: dcp-168-11-hzqsh.node.hzqsh.wacai.sdc:8042, Resource: <memory:5120, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 10.1.168.11:44636 }, ] on host dcp-168-11-hzqsh.node.hzqsh.wacai.sdc
> 2017-02-19 03:20:42,971 INFO  org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy  - Opening proxy : dcp-168-11-hzqsh.node.hzqsh.wacai.sdc:44636
> 2017-02-19 03:20:43,014 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Received new container: container_1482390799413_0053_02_000003 - Remaining pending container requests: 4
> 2017-02-19 03:20:43,014 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Launching TaskManager in container ContainerInLaunch @ 1487445643014: Container: [ContainerId: container_1482390799413_0053_02_000003, NodeId: dcp-168-3-hzqsh.node.hzqsh.wacai.sdc:9910, NodeHttpAddress: dcp-168-3-hzqsh.node.hzqsh.wacai.sdc:8042, Resource: <memory:5120, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 10.1.168.3:9910 }, ] on host dcp-168-3-hzqsh.node.hzqsh.wacai.sdc
> 2017-02-19 03:20:43,015 INFO  org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy  - Opening proxy : dcp-168-3-hzqsh.node.hzqsh.wacai.sdc:9910
> 2017-02-19 03:20:43,025 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Received new container: container_1482390799413_0053_02_000004 - Remaining pending container requests: 3
> 2017-02-19 03:20:43,025 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Launching TaskManager in container ContainerInLaunch @ 1487445643025: Container: [ContainerId: container_1482390799413_0053_02_000004, NodeId: dcp-168-9-hzqsh.node.hzqsh.wacai.sdc:57063, NodeHttpAddress: dcp-168-9-hzqsh.node.hzqsh.wacai.sdc:8042, Resource: <memory:5120, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 10.1.168.9:57063 }, ] on host dcp-168-9-hzqsh.node.hzqsh.wacai.sdc
> 2017-02-19 03:20:43,026 INFO  org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy  - Opening proxy : dcp-168-9-hzqsh.node.hzqsh.wacai.sdc:57063
> 2017-02-19 03:20:43,043 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Received new container: container_1482390799413_0053_02_000005 - Remaining pending container requests: 2
> 2017-02-19 03:20:43,043 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Launching TaskManager in container ContainerInLaunch @ 1487445643043: Container: [ContainerId: container_1482390799413_0053_02_000005, NodeId: dcp-168-1-hzqsh.node.hzqsh.wacai.sdc:42444, NodeHttpAddress: dcp-168-1-hzqsh.node.hzqsh.wacai.sdc:8042, Resource: <memory:5120, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 10.1.168.1:42444 }, ] on host dcp-168-1-hzqsh.node.hzqsh.wacai.sdc
> 2017-02-19 03:20:43,045 INFO  org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy  - Opening proxy : dcp-168-1-hzqsh.node.hzqsh.wacai.sdc:42444
> 2017-02-19 03:20:43,461 INFO  org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl         - Received new token for : dcp-168-6-hzqsh.node.hzqsh.wacai.sdc:43466
> 2017-02-19 03:20:43,461 INFO  org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl         - Received new token for : dcp-168-5-hzqsh.node.hzqsh.wacai.sdc:64371
> 2017-02-19 03:20:43,462 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Received new container: container_1482390799413_0053_02_000006 - Remaining pending container requests: 1
> 2017-02-19 03:20:43,463 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Launching TaskManager in container ContainerInLaunch @ 1487445643462: Container: [ContainerId: container_1482390799413_0053_02_000006, NodeId: dcp-168-6-hzqsh.node.hzqsh.wacai.sdc:43466, NodeHttpAddress: dcp-168-6-hzqsh.node.hzqsh.wacai.sdc:8042, Resource: <memory:5120, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 10.1.168.6:43466 }, ] on host dcp-168-6-hzqsh.node.hzqsh.wacai.sdc
> 2017-02-19 03:20:43,463 INFO  org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy  - Opening proxy : dcp-168-6-hzqsh.node.hzqsh.wacai.sdc:43466
> 2017-02-19 03:20:43,482 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Received new container: container_1482390799413_0053_02_000007 - Remaining pending container requests: 0
> 2017-02-19 03:20:43,482 INFO  org.apache.flink.yarn.YarnFlinkResourceManager                - Launching TaskManager in container ContainerInLaunch @ 1487445643482: Container: [ContainerId: container_1482390799413_0053_02_000007, NodeId: dcp-168-5-hzqsh.node.hzqsh.wacai.sdc:64371, NodeHttpAddress: dcp-168-5-hzqsh.node.hzqsh.wacai.sdc:8042, Resource: <memory:5120, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 10.1.168.5:64371 }, ] on host dcp-168-5-hzqsh.node.hzqsh.wacai.sdc
> 2017-02-19 03:20:43,483 INFO  org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy  - Opening proxy : dcp-168-5-hzqsh.node.hzqsh.wacai.sdc:64371
> 2017-02-19 03:20:44,244 WARN  org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler     - Error while handling request
> org.apache.flink.runtime.webmonitor.NotFoundException: Could not find job with id 1b45608e30808183913eeffbb4d855da
> 	at org.apache.flink.runtime.webmonitor.handlers.AbstractExecutionGraphRequestHandler.handleRequest(AbstractExecutionGraphRequestHandler.java:58)
> 	at org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler.respondAsLeader(RuntimeMonitorHandler.java:88)
> 	at org.apache.flink.runtime.webmonitor.RuntimeMonitorHandlerBase.channelRead0(RuntimeMonitorHandlerBase.java:84)
> 	at org.apache.flink.runtime.webmonitor.RuntimeMonitorHandlerBase.channelRead0(RuntimeMonitorHandlerBase.java:44)
> 	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
> 	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
> 	at io.netty.handler.codec.http.router.Handler.routed(Handler.java:62)
> 	at io.netty.handler.codec.http.router.DualAbstractHandler.channelRead0(DualAbstractHandler.java:57)
> 	at io.netty.handler.codec.http.router.DualAbstractHandler.channelRead0(DualAbstractHandler.java:20)
> 	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
> 	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
> 	at org.apache.flink.runtime.webmonitor.HttpRequestHandler.channelRead0(HttpRequestHandler.java:105)
> 	at org.apache.flink.runtime.webmonitor.HttpRequestHandler.channelRead0(HttpRequestHandler.java:65)
> 	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
> 	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
> 	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)
> 	at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:147)
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
> 	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
> 	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
> 	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
> 	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
> 	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
> 	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
> 	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
> 	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
> 	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
> 	at java.lang.Thread.run(Thread.java:745)
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
