hadoop-yarn-issues mailing list archives

From "stefanlee (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3795) ZKRMStateStore crashes due to IOException: Broken pipe
Date Fri, 25 Nov 2016 08:39:58 GMT

    [ https://issues.apache.org/jira/browse/YARN-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15695267#comment-15695267 ]

stefanlee commented on YARN-3795:
---------------------------------

Hi, I have the same problem, but in a different scenario. When I failed over from RM2 to
RM1, the ZooKeeper server on RM1 reported a large number of watchers, although RM1 itself
was healthy. I then restarted the ZooKeeper server on RM1. After that, RM1's web UI could
not be accessed, RM1's log contained many "Broken pipe" messages, and
"java.io.IOException: Len error" appeared in the ZooKeeper server's log. Was your
ZooKeeper healthy when the above problem occurred?
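
For anyone comparing notes, here is a minimal sketch of one way to check whether a ZooKeeper server responds and how many watches it holds. It assumes ZooKeeper's four-letter-word admin commands ("ruok", "wchs") are enabled on the client port; newer ZooKeeper releases may require them to be whitelisted via 4lw.commands.whitelist. The class name ZkFourLetterCheck and the localhost:2181 defaults are illustrative and not taken from this issue.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Hadoop or ZooKeeper.
class ZkFourLetterCheck {

    // Sends a four-letter command (e.g. "ruok", "wchs") and returns the raw reply.
    static String send(String host, int port, String cmd, int timeoutMs) throws IOException {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            socket.setSoTimeout(timeoutMs);
            OutputStream out = socket.getOutputStream();
            out.write(cmd.getBytes(StandardCharsets.US_ASCII));
            out.flush();
            // The server closes the connection after replying, so read until EOF.
            InputStream in = socket.getInputStream();
            ByteArrayOutputStream reply = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                reply.write(buf, 0, n);
            }
            return new String(reply.toByteArray(), StandardCharsets.US_ASCII);
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed defaults; pass the real ZooKeeper host and client port as arguments.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 2181;
        System.out.println("ruok -> " + send(host, port, "ruok", 5000));
        System.out.println("wchs -> " + send(host, port, "wchs", 5000));
    }
}
```

A healthy server answers "ruok" with "imok", and "wchs" prints a brief summary of connections and watches. Comparing that summary before and after an RM failover would show whether the watcher count keeps growing.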

> ZKRMStateStore crashes due to IOException: Broken pipe
> ------------------------------------------------------
>
>                 Key: YARN-3795
>                 URL: https://issues.apache.org/jira/browse/YARN-3795
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.5.0
>            Reporter: lachisis
>            Priority: Critical
>             Fix For: 2.7.1
>
>
> 2015-06-05 06:06:54,848 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to dap88/134.41.33.88:2181, initiating session
> 2015-06-05 06:06:54,876 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server dap88/134.41.33.88:2181, sessionid = 0x34db2f72ac50c86, negotiated timeout = 10000
> 2015-06-05 06:06:54,881 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: Watcher event type: None with state:SyncConnected for path:null for Service org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore in state org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: STARTED
> 2015-06-05 06:06:54,881 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: ZKRMStateStore Session connected
> 2015-06-05 06:06:54,881 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: ZKRMStateStore Session restored
> 2015-06-05 06:06:54,881 WARN org.apache.zookeeper.ClientCnxn: Session 0x34db2f72ac50c86 for server dap88/134.41.33.88:2181, unexpected error, closing socket connection and attempting reconnect
> java.io.IOException: Broken pipe
> 	at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
> 	at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
> 	at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:94)
> 	at sun.nio.ch.IOUtil.write(IOUtil.java:65)
> 	at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:450)
> 	at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:117)
> 	at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:355)
> 	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1075)
> 2015-06-05 06:06:54,986 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: Watcher event type: None with state:Disconnected for path:null for Service org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore in state org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: STARTED
> 2015-06-05 06:06:54,986 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: ZKRMStateStore Session disconnected
> 2015-06-05 06:06:55,278 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server dap87/134.41.33.87:2181. Will not attempt to authenticate using SASL (unknown error)
> 2015-06-05 06:06:55,278 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to dap87/134.41.33.87:2181, initiating session
> 2015-06-05 06:06:55,330 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server dap87/134.41.33.87:2181, sessionid = 0x34db2f72ac50c86, negotiated timeout = 10000
> 2015-06-05 06:06:55,343 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: Watcher event type: None with state:SyncConnected for path:null for Service org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore in state org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: STARTED
> 2015-06-05 06:06:55,343 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: ZKRMStateStore Session connected
> 2015-06-05 06:06:55,344 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: ZKRMStateStore Session restored
> 2015-06-05 06:06:55,345 WARN org.apache.zookeeper.ClientCnxn: Session 0x34db2f72ac50c86 for server dap87/134.41.33.87:2181, unexpected error, closing socket connection and attempting reconnect
> java.io.IOException: Broken pipe
> 	at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
> 	at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
> 	at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:94)
> 	at sun.nio.ch.IOUtil.write(IOUtil.java:65)
> 	at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:450)
> 	at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:117)
> 	at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:355)
> 	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1075)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

