flink-issues mailing list archives

From "Andrey (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-5685) Connection leak in Taskmanager
Date Fri, 19 May 2017 12:19:04 GMT

    [ https://issues.apache.org/jira/browse/FLINK-5685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16017318#comment-16017318 ]

Andrey commented on FLINK-5685:
-------------------------------

{code}
#: ulimit -n
1024
{code}

This issue is caused by https://issues.apache.org/jira/browse/FLINK-3347
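
To see how close a running process is to the limit shown above, something like the following Python sketch may help. It is only an illustration: it is Linux-only because it reads /proc, the function name `fd_usage` is made up for this example, and nothing here is part of Flink.

```python
import os


def fd_usage(pid):
    """Return (open_fds, soft_limit) for a process on Linux.

    open_fds is the number of entries in /proc/<pid>/fd.
    soft_limit is parsed from the "Max open files" row of /proc/<pid>/limits
    and is None if the row is missing or not numeric (e.g. "unlimited").
    """
    open_fds = len(os.listdir(f"/proc/{pid}/fd"))

    soft_limit = None
    with open(f"/proc/{pid}/limits") as limits:
        for line in limits:
            if line.startswith("Max open files"):
                # Row layout: "Max open files  <soft>  <hard>  files"
                soft = line.split()[3]
                if soft.isdigit():
                    soft_limit = int(soft)
                break
    return open_fds, soft_limit


if __name__ == "__main__":
    # Inspect the current process; pass the JobManager's pid instead
    # (this usually requires running as the same user or as root).
    used, limit = fd_usage(os.getpid())
    print(f"open descriptors: {used}, soft limit: {limit}")
```

Polling this for the JobManager pid while the task managers keep retrying should show whether the descriptor count climbs toward 1024.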

> Connection leak in Taskmanager
> ------------------------------
>
>                 Key: FLINK-5685
>                 URL: https://issues.apache.org/jira/browse/FLINK-5685
>             Project: Flink
>          Issue Type: Bug
>          Components: Distributed Coordination
>            Reporter: Andrey
>            Priority: Critical
>
> Steps to reproduce:
> * Set up a cluster with the following configuration: 1 job manager, 2 task managers.
> * The job manager starts rejecting connection attempts from the task manager.
> {code}
> 2017-01-30 03:24:42,908 INFO  org.apache.flink.runtime.taskmanager.TaskManager - Trying to register at JobManager akka.tcp://flink@ip:6123/user/jobmanager (attempt 4326, timeout: 30 seconds)
> 2017-01-30 03:24:42,913 WARN  Remoting - Tried to associate with unreachable remote address [akka.tcp://flink@ip:6123]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: The remote system has quarantined this system. No further associations to the remote system are possible until this system is restarted.
> {code}
> * The task manager retries many times (it looks like it doesn't close the connection after a failure).
> * The job manager becomes unable to process any messages. In the logs:
> {code}
> 2017-01-30 03:25:12,932 WARN  org.jboss.netty.channel.socket.nio.AbstractNioSelector - Failed to accept a connection.
> java.io.IOException: Too many open files
>         at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
>         at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422)
>         at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
>         at org.jboss.netty.channel.socket.nio.NioServerBoss.process(NioServerBoss.java:100)
>         at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
>         at org.jboss.netty.channel.socket.nio.NioServerBoss.run(NioServerBoss.java:42)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> {code}
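
The report suspects that a connection is not closed after a failed attempt. As a generic illustration of that pattern only (this is plain Python sockets, not Flink or Akka code, and `connect_with_retry` is a hypothetical name), a retry loop that releases its descriptor on every failure might be sketched as:

```python
import socket
import time


def connect_with_retry(host, port, attempts=3, timeout=1.0, pause=0.0):
    """Try to open a TCP connection up to `attempts` times.

    Returns the connected socket (the caller must close it), or None if
    every attempt failed. Each failed attempt closes its socket so that
    repeated retries do not accumulate open file descriptors.
    """
    for _ in range(attempts):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.settimeout(timeout)
        try:
            sock.connect((host, port))
            return sock
        except OSError:
            # Without this close(), each retry would keep one more
            # descriptor open until garbage collection or process exit.
            sock.close()
            time.sleep(pause)
    return None
```

With thousands of attempts, as in the log above (attempt 4326), omitting the `close()` in a loop like this would be enough to exhaust a limit of 1024 descriptors on whichever side holds the sockets open.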



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
