flink-user-zh mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "claylin" <1012539...@qq.com>
Subject 从checkpoint恢复任务失败
Date Thu, 16 Jan 2020 14:00:39 GMT
用的版本1.9.1,我这里只要遇到异常,譬如空指针异常,然后从checkpoint恢复,总是恢复失败,报找不到sst文件的错误,错误堆栈如下:
2020-01-16 19:29:39
java.lang.Exception: Exception while creating StreamOperatorStateContext.
	at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:195)
	at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:253)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeState(StreamTask.java:881)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:395)
	at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:705)
	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:530)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.flink.util.FlinkException: Could not restore keyed state backend for
KeyedProcessOperator_8ea7af242b2bcc2d11daf69b5d588c4d_(31/32) from any of the 1 provided restore
options.
	at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:135)
	at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.keyedStatedBackend(StreamTaskStateInitializerImpl.java:307)
	at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:135)
	... 6 more
Caused by: org.apache.flink.runtime.state.BackendBuildingException: Caught unexpected exception.
	at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackendBuilder.build(RocksDBKeyedStateBackendBuilder.java:326)
	at org.apache.flink.contrib.streaming.state.RocksDBStateBackend.createKeyedStateBackend(RocksDBStateBackend.java:520)
	at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.lambda$keyedStatedBackend$1(StreamTaskStateInitializerImpl.java:291)
	at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.attemptCreateAndRestore(BackendRestorerProcedure.java:142)
	at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:121)
	... 8 more
Caused by: java.nio.file.NoSuchFileException: /data/hadoop/tmp/nm-local-dir/usercache/www-data/appcache/application_1579002711906_0001/flink-io-bd910c0d-03c7-48ff-8712-4e7059bac574/job_bbe797c8fcdf7c362bed774435ae5f86_op_KeyedProcessOperator_8ea7af242b2bcc2d11daf69b5d588c4d__31_32__uuid_7259cf96-aa16-423e-a356-dcac0a7859f2/db/000019.sst
-&gt; /data/hadoop/tmp/nm-local-dir/usercache/www-data/appcache/application_1579002711906_0001/flink-io-bd910c0d-03c7-48ff-8712-4e7059bac574/job_bbe797c8fcdf7c362bed774435ae5f86_op_KeyedProcessOperator_8ea7af242b2bcc2d11daf69b5d588c4d__31_32__uuid_7259cf96-aa16-423e-a356-dcac0a7859f2/40e6dc65-7fac-41ae-b736-91c4ecd5e296/000019.sst
	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
	at sun.nio.fs.UnixFileSystemProvider.createLink(UnixFileSystemProvider.java:476)
	at java.nio.file.Files.createLink(Files.java:1086)
	at org.apache.flink.contrib.streaming.state.restore.RocksDBIncrementalRestoreOperation.restoreInstanceDirectoryFromPath(RocksDBIncrementalRestoreOperation.java:473)
	at org.apache.flink.contrib.streaming.state.restore.RocksDBIncrementalRestoreOperation.restoreFromLocalState(RocksDBIncrementalRestoreOperation.java:212)
	at org.apache.flink.contrib.streaming.state.restore.RocksDBIncrementalRestoreOperation.restoreFromRemoteState(RocksDBIncrementalRestoreOperation.java:188)
	at org.apache.flink.contrib.streaming.state.restore.RocksDBIncrementalRestoreOperation.restoreWithoutRescaling(RocksDBIncrementalRestoreOperation.java:162)
	at org.apache.flink.contrib.streaming.state.restore.RocksDBIncrementalRestoreOperation.restore(RocksDBIncrementalRestoreOperation.java:148)
	at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackendBuilder.build(RocksDBKeyedStateBackendBuilder.java:270)
	... 12 more



从错误日志看,首先从远程dfs下载checkpoint,下载到本地后,再做链接,但是在链接过程报错找不到文件,这个难道是权限问题吗
Mime
  • Unnamed multipart/alternative (inline, 8-Bit, 0 bytes)
View raw message