flink-user-zh mailing list archives

From Zili Chen <wander4...@gmail.com>
Subject Re: Flink HA mode process hangs!!!
Date Mon, 25 Mar 2019 11:53:26 GMT
Could you share your HA configuration? In particular high-availability.storageDir; I suspect it has not been configured.
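
For reference, a ZooKeeper-based HA setup in flink-conf.yaml usually looks roughly like the sketch below. The option keys are Flink's own, but every value (host names, paths, cluster id) is a placeholder I made up for illustration, not something taken from your cluster.

```yaml
# flink-conf.yaml (illustrative sketch; all values are hypothetical placeholders)
high-availability: zookeeper
# Durable storage where the JobManager writes metadata such as submitted job graphs
high-availability.storageDir: hdfs:///path/to/flink/ha/
# ZooKeeper quorum; these host names are assumptions
high-availability.zookeeper.quorum: zk1:2181,zk2:2181,zk3:2181
# Root znode for Flink, and the per-cluster znode beneath it
high-availability.zookeeper.path.root: /flink
high-availability.cluster-id: /my_cluster
```

ZooKeeper only keeps pointers (state handles) to the job graphs; the job graph files themselves live under storageDir, so the two have to stay consistent with each other.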
Best,
tison.


On Mon, Mar 25, 2019 at 7:26 PM, Han Xiao <xiaoh20@chinaunicom.cn> wrote:

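> Hello everyone, I am new to Flink and ran into a problem while deploying
> Flink HA; could you please take a look?
> After starting Flink HA, the JobManager process hangs immediately. I am
> using Flink 1.7.2. The log below contains this error:
> File does not exist: /flink/ha/zookeeper/submittedJobGraphb05001535f91
> What puzzles me is that neither my checkpoint directory nor my HA
> directory is this path, so why does Flink look there? No JobGraph was
> generated under the directory I configured. It keeps trying to retrieve the
> job graph /a5ffe00b0bc5688d9a7de5c62b8150e6 and cannot find it. I deleted
> all the related configured paths and rebuilt the cluster, but on startup it
> still tries to retrieve it. How can I stop Flink from retrieving this
> JobGraph so that my HA cluster runs healthily?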
>
>
> Error log:
> 2019-03-25 18:55:00,742 ERROR
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint         - Fatal error
> occurred in the cluster entrypoint.
> java.lang.RuntimeException: org.apache.flink.util.FlinkException: Could
> not retrieve submitted JobGraph from state handle under
> /a5ffe00b0bc5688d9a7de5c62b8150e6. This indicates that the retrieved state
> handle is broken. Try cleaning the state handle store.
>         at
> org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:199)
>         at
> org.apache.flink.util.function.FunctionUtils.lambda$uncheckedFunction$2(FunctionUtils.java:74)
>         at
> java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
> .......
> Caused by: org.apache.flink.util.FlinkException: Could not retrieve
> submitted JobGraph from state handle under
> /a5ffe00b0bc5688d9a7de5c62b8150e6. This indicates that the retrieved state
> handle is broken. Try cleaning the state handle store.
>         at
> org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGraphStore.recoverJobGraph(ZooKeeperSubmittedJobGraphStore.java:208)
>         at
> org.apache.flink.runtime.dispatcher.Dispatcher.recoverJob(Dispatcher.java:696)
>         at
> org.apache.flink.runtime.dispatcher.Dispatcher.recoverJobGraphs(Dispatcher.java:681)
> ........
> Caused by: java.io.FileNotFoundException: File does not exist:
> /flink/ha/zookeeper/submittedJobGraphb05001535f91
>         at
> org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:66)
>         at
> org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:56)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:2100)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:2070)
> .......
> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException):
> File does not exist: /flink/ha/zookeeper/submittedJobGraphb05001535f91
>         at
> org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:66)
>         at
> org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:56)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:2100)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:2070)
> .......
>
> Thanks!
>