flink-issues mailing list archives

From "Stephan Ewen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-2455) Misleading I/O manager error log messages
Date Sun, 02 Aug 2015 15:24:04 GMT

    [ https://issues.apache.org/jira/browse/FLINK-2455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14651049#comment-14651049
] 

Stephan Ewen commented on FLINK-2455:
-------------------------------------

I see. Asynchronous channels are flushed when closing, which guarantees that any error
during a write has been seen by the time the close completes.

What would be needed is a way to close the FileHandle and delete the file without waiting
for pending I/O requests. Those would then fail, but that is not too bad, as long as it is
guaranteed that the memory segments are properly recycled.
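
As a rough illustration of that idea, here is a minimal single-threaded sketch. It is not
Flink code: AbortingChannelSketch, BufferPool, Channel, enqueueWrite and closeAndDeleteNow are
hypothetical names, and plain byte arrays stand in for memory segments. It only shows the
ordering: mark the channel closed, drop the queued requests, recycle their buffers, then
delete the file.

```java
import java.io.File;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical sketch; none of these names are Flink APIs. */
final class AbortingChannelSketch {

    /** Stand-in for the pool that memory segments are returned to. */
    static final class BufferPool {
        private final List<byte[]> free = new ArrayList<>();

        synchronized void recycle(byte[] buffer) {
            free.add(buffer);
        }

        synchronized int available() {
            return free.size();
        }
    }

    /** Stand-in for an asynchronous file channel with queued write requests. */
    static final class Channel {
        private final Object lock = new Object();
        private final Queue<byte[]> pendingWrites = new ArrayDeque<>();
        private final File file;
        private final BufferPool pool;
        private boolean closed;

        Channel(File file, BufferPool pool) {
            this.file = file;
            this.pool = pool;
        }

        /** Queues a write; once closed, recycles the buffer and returns false. */
        boolean enqueueWrite(byte[] buffer) {
            synchronized (lock) {
                if (!closed) {
                    pendingWrites.add(buffer);
                    return true;
                }
            }
            pool.recycle(buffer);
            return false;
        }

        /**
         * Closes without waiting for queued requests: they are dropped, their
         * buffers are recycled, and the file is deleted (best effort).
         * Returns the number of requests that were aborted.
         */
        int closeAndDeleteNow() {
            List<byte[]> aborted;
            synchronized (lock) {
                closed = true;
                aborted = new ArrayList<>(pendingWrites);
                pendingWrites.clear();
            }
            for (byte[] buffer : aborted) {
                pool.recycle(buffer); // the write is lost, the memory is not
            }
            file.delete(); // returns false if the file does not exist
            return aborted.size();
        }
    }
}
```

The sketch leaves out a request that the writer thread has already picked up. A real
implementation would also need that request to fail cleanly and hand its segment back.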

> Misleading I/O manager error log messages
> -----------------------------------------
>
>                 Key: FLINK-2455
>                 URL: https://issues.apache.org/jira/browse/FLINK-2455
>             Project: Flink
>          Issue Type: Improvement
>          Components: Distributed Runtime
>    Affects Versions: 0.9, master
>            Reporter: Ufuk Celebi
>             Fix For: 0.10, 0.9.1
>
>
> The logs reported by [~andralungu] in FLINK-2412 show a lot of the following messages:
> {code}
> 20:13:27,504 WARN  org.apache.flink.runtime.taskmanager.Task                     - Task
'CHAIN DataSource (at getEdgesDataSet(Degrees.java:64) (org.apache.flink.api.java.io.CsvInputFormat))
-> Map (Map at getEdgesDataSet(Degrees.java:64)) (50/60)' did not react to cancelling signal,
but is stuck in method:
>  java.lang.Object.wait(Native Method)
> org.apache.flink.runtime.io.disk.iomanager.AsynchronousFileIOChannel.close(AsynchronousFileIOChannel.java:126)
> org.apache.flink.runtime.io.disk.iomanager.AsynchronousFileIOChannel.closeAndDelete(AsynchronousFileIOChannel.java:158)
> org.apache.flink.runtime.io.network.partition.SpillableSubpartition.release(SpillableSubpartition.java:130)
> org.apache.flink.runtime.io.network.partition.ResultPartition.release(ResultPartition.java:300)
> org.apache.flink.runtime.io.network.partition.ResultPartitionManager.releasePartitionsProducedBy(ResultPartitionManager.java:95)
> org.apache.flink.runtime.io.network.NetworkEnvironment.unregisterTask(NetworkEnvironment.java:356)
> org.apache.flink.runtime.taskmanager.Task.run(Task.java:674)
> java.lang.Thread.run(Thread.java:722)
> 20:13:27,583 ERROR org.apache.flink.runtime.io.network.partition.ResultPartition  - Error
during release of result subpartition: Closing of asynchronous file channel was interrupted.
> java.io.IOException: Closing of asynchronous file channel was interrupted.
> 	at org.apache.flink.runtime.io.disk.iomanager.AsynchronousFileIOChannel.close(AsynchronousFileIOChannel.java:130)
> 	at org.apache.flink.runtime.io.disk.iomanager.AsynchronousFileIOChannel.closeAndDelete(AsynchronousFileIOChannel.java:158)
> 	at org.apache.flink.runtime.io.network.partition.SpillableSubpartition.release(SpillableSubpartition.java:130)
> 	at org.apache.flink.runtime.io.network.partition.ResultPartition.release(ResultPartition.java:300)
> 	at org.apache.flink.runtime.io.network.partition.ResultPartitionManager.releasePartitionsProducedBy(ResultPartitionManager.java:95)
> 	at org.apache.flink.runtime.io.network.NetworkEnvironment.unregisterTask(NetworkEnvironment.java:356)
> 	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:674)
> 	at java.lang.Thread.run(Thread.java:722)
> {code}
> This is repeated for each subpartition during the release of a spillable partition (each
subpartition is closed independently). The task is interrupted while waiting for the file
channel to be closed.
> {code}
> 20:15:50,329 ERROR org.apache.flink.runtime.io.network.partition.ResultPartition  - Error
during release of result subpartition: IO-Manager has been closed.
> java.io.IOException: IO-Manager has been closed.
> 	at org.apache.flink.runtime.io.disk.iomanager.IOManagerAsync$WriterThread.shutdown(IOManagerAsync.java:424)
> 	at org.apache.flink.runtime.io.disk.iomanager.IOManagerAsync.shutdown(IOManagerAsync.java:125)
> 	at org.apache.flink.runtime.io.disk.iomanager.IOManager$1.run(IOManager.java:103)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
