flink-issues mailing list archives

From "Robert Metzger (Jira)" <j...@apache.org>
Subject [jira] [Updated] (FLINK-19426) Streaming bucketing end-to-end test sometimes fails with "Could not assign resource ... to current execution ..."
Date Tue, 29 Sep 2020 15:12:00 GMT

     [ https://issues.apache.org/jira/browse/FLINK-19426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Metzger updated FLINK-19426:
-----------------------------------
    Summary: Streaming bucketing end-to-end test sometimes fails with "Could not assign resource ... to current execution ..."  (was: End-to-end test sometimes fails with PartitionConnectionException)

> Streaming bucketing end-to-end test sometimes fails with "Could not assign resource ... to current execution ..."
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-19426
>                 URL: https://issues.apache.org/jira/browse/FLINK-19426
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Network, Tests
>    Affects Versions: 1.12.0
>            Reporter: Dian Fu
>            Assignee: Robert Metzger
>            Priority: Major
>              Labels: test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=6983&view=logs&j=68a897ab-3047-5660-245a-cce8f83859f6&t=16ca2cca-2f63-5cce-12d2-d519b930a729
> {code}
> 2020-09-26T22:16:26.9856525Z org.apache.flink.runtime.io.network.partition.consumer.PartitionConnectionException: Connection for partition 619775973ed0f282e20f9d55d13913ab#0@bc764cd8ddf7a0cff126f51c16239658_0_1 not reachable.
> 2020-09-26T22:16:26.9857848Z 	at org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannel.requestSubpartition(RemoteInputChannel.java:159) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-09-26T22:16:26.9859168Z 	at org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.internalRequestPartitions(SingleInputGate.java:336) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-09-26T22:16:26.9860449Z 	at org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.requestPartitions(SingleInputGate.java:308) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-09-26T22:16:26.9861677Z 	at org.apache.flink.runtime.taskmanager.InputGateWithMetrics.requestPartitions(InputGateWithMetrics.java:95) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-09-26T22:16:26.9862861Z 	at org.apache.flink.streaming.runtime.tasks.StreamTask.requestPartitions(StreamTask.java:542) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-09-26T22:16:26.9864018Z 	at org.apache.flink.streaming.runtime.tasks.StreamTask.readRecoveredChannelState(StreamTask.java:507) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-09-26T22:16:26.9865284Z 	at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$beforeInvoke$0(StreamTask.java:498) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-09-26T22:16:26.9866415Z 	at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:47) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-09-26T22:16:26.9867500Z 	at org.apache.flink.streaming.runtime.tasks.StreamTask.beforeInvoke(StreamTask.java:492) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-09-26T22:16:26.9868514Z 	at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:550) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-09-26T22:16:26.9869450Z 	at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:722) [flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-09-26T22:16:26.9870339Z 	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:547) [flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-09-26T22:16:26.9870869Z 	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_265]
> 2020-09-26T22:16:26.9872060Z Caused by: java.io.IOException: java.util.concurrent.ExecutionException: org.apache.flink.runtime.io.network.netty.exception.RemoteTransportException: Connecting to remote task manager '/10.1.0.4:38905' has failed. This might indicate that the remote task manager has been lost.
> 2020-09-26T22:16:26.9873511Z 	at org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.createPartitionRequestClient(PartitionRequestClientFactory.java:85) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-09-26T22:16:26.9874788Z 	at org.apache.flink.runtime.io.network.netty.NettyConnectionManager.createPartitionRequestClient(NettyConnectionManager.java:67) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-09-26T22:16:26.9876084Z 	at org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannel.requestSubpartition(RemoteInputChannel.java:156) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-09-26T22:16:26.9876567Z 	... 12 more
> 2020-09-26T22:16:26.9877477Z Caused by: java.util.concurrent.ExecutionException: org.apache.flink.runtime.io.network.netty.exception.RemoteTransportException: Connecting to remote task manager '/10.1.0.4:38905' has failed. This might indicate that the remote task manager has been lost.
> 2020-09-26T22:16:26.9878503Z 	at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) ~[?:1.8.0_265]
> 2020-09-26T22:16:26.9879061Z 	at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908) ~[?:1.8.0_265]
> 2020-09-26T22:16:26.9880244Z 	at org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.createPartitionRequestClient(PartitionRequestClientFactory.java:83) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-09-26T22:16:26.9884461Z 	at org.apache.flink.runtime.io.network.netty.NettyConnectionManager.createPartitionRequestClient(NettyConnectionManager.java:67) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-09-26T22:16:26.9885737Z 	at org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannel.requestSubpartition(RemoteInputChannel.java:156) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-09-26T22:16:26.9886304Z 	... 12 more
> 2020-09-26T22:16:26.9887211Z Caused by: org.apache.flink.runtime.io.network.netty.exception.RemoteTransportException: Connecting to remote task manager '/10.1.0.4:38905' has failed. This might indicate that the remote task manager has been lost.
> 2020-09-26T22:16:26.9888456Z 	at org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.connect(PartitionRequestClientFactory.java:122) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-09-26T22:16:26.9889704Z 	at org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.connectWithRetries(PartitionRequestClientFactory.java:101) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-09-26T22:16:26.9891028Z 	at org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.lambda$createPartitionRequestClient$1(PartitionRequestClientFactory.java:78) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-09-26T22:16:26.9892193Z 	at org.apache.flink.runtime.concurrent.FutureUtils.completeFromCallable(FutureUtils.java:87) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-09-26T22:16:26.9893396Z 	at org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.createPartitionRequestClient(PartitionRequestClientFactory.java:78) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-09-26T22:16:26.9894646Z 	at org.apache.flink.runtime.io.network.netty.NettyConnectionManager.createPartitionRequestClient(NettyConnectionManager.java:67) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-09-26T22:16:26.9895718Z 	at org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannel.requestSubpartition(RemoteInputChannel.java:156) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-09-26T22:16:26.9896201Z 	... 12 more
> 2020-09-26T22:16:26.9896424Z Caused by: java.lang.NullPointerException
> 2020-09-26T22:16:26.9897066Z 	at org.apache.flink.util.Preconditions.checkNotNull(Preconditions.java:58) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-09-26T22:16:26.9898008Z 	at org.apache.flink.runtime.io.network.netty.NettyPartitionRequestClient.<init>(NettyPartitionRequestClient.java:73) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-09-26T22:16:26.9899040Z 	at org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.connect(PartitionRequestClientFactory.java:116) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-09-26T22:16:26.9900118Z 	at org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.connectWithRetries(PartitionRequestClientFactory.java:101) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-09-26T22:16:26.9901443Z 	at org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.lambda$createPartitionRequestClient$1(PartitionRequestClientFactory.java:78) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-09-26T22:16:26.9902613Z 	at org.apache.flink.runtime.concurrent.FutureUtils.completeFromCallable(FutureUtils.java:87) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-09-26T22:16:26.9904043Z 	at org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.createPartitionRequestClient(PartitionRequestClientFactory.java:78) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-09-26T22:16:26.9905404Z 	at org.apache.flink.runtime.io.network.netty.NettyConnectionManager.createPartitionRequestClient(NettyConnectionManager.java:67) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-09-26T22:16:26.9906893Z 	at org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannel.requestSubpartition(RemoteInputChannel.java:156) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
> 2020-09-26T22:16:26.9907510Z 	... 12 more
> {code}



