hadoop-common-issues mailing list archives

From "surendra singh lilhore (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-10722) Standby NN continuing as standby when active NN machine got shutdown.
Date Thu, 19 Jun 2014 10:45:25 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-10722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14037221#comment-14037221 ]

surendra singh lilhore commented on HADOOP-10722:
-------------------------------------------------

@vinay

Thanks

It's working fine.

{noformat}
2014-06-19 16:29:26,083 INFO org.apache.hadoop.ha.NodeFencer: ====== Beginning Service Fencing Process... ======
2014-06-19 16:29:26,083 INFO org.apache.hadoop.ha.NodeFencer: Trying method 1/1: org.apache.hadoop.ha.ShellCommandFencer(/bin/true)
2014-06-19 16:29:26,129 INFO org.apache.hadoop.ha.ShellCommandFencer: Launched fencing command '/bin/true' with pid 24316
2014-06-19 16:29:26,168 INFO org.apache.hadoop.ha.NodeFencer: ====== Fencing successful by method org.apache.hadoop.ha.ShellCommandFencer(/bin/true) ======
2014-06-19 16:29:26,168 INFO org.apache.hadoop.ha.ActiveStandbyElector: Writing znode /hadoop-ha/mycluster/ActiveBreadCrumb to indicate that the local node is the most recent active...
2014-06-19 16:29:26,206 INFO org.apache.hadoop.ha.ZKFailoverController: Trying to make NameNode at host-10-18-40-90/10.18.40.90:8020 active...
2014-06-19 16:29:26,862 INFO org.apache.hadoop.ha.ZKFailoverController: Successfully transitioned NameNode at host-10-18-40-90/10.18.40.90:8020 to active state
{noformat}
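
For anyone hitting the same problem: the log above shows a single fencing method, shell(/bin/true), so the relevant entry in hdfs-site.xml presumably looks something like the sketch below. The property name dfs.ha.fencing.methods is the standard HDFS HA setting; the exact contents of this cluster's file are an assumption, since they are not shown in this thread.

```xml
<!-- hdfs-site.xml (sketch; this cluster's actual file is not shown in the thread) -->
<property>
  <name>dfs.ha.fencing.methods</name>
  <!-- One fencing method per line, tried in order.
       shell(/bin/true) always reports success, so failover can
       proceed even when the old active host is unreachable. -->
  <value>shell(/bin/true)</value>
</property>
```

Note that shell(/bin/true) does not actually fence anything. With QJM only one NameNode can write to the JournalNodes at a time, so the edit log stays protected, but an old active NN that is still running could briefly serve stale reads. The earlier sshfence attempt failed because it has to SSH into the old active host, which is impossible once that machine is powered off.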

> Standby NN continuing as standby when active NN machine got shutdown.
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-10722
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10722
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: auto-failover, ha
>    Affects Versions: 2.4.0
>            Reporter: surendra singh lilhore
>
> I have an HA cluster with 3 ZK, 3 QJM.
> My active NN machine was shut down, but my standby NN stayed in standby.
> It should have become active.
> ZKFC logs
> ========
> {noformat}
> 2014-06-19 13:39:30,810 INFO org.apache.hadoop.ha.NodeFencer: ====== Beginning Service Fencing Process... ======
> 2014-06-19 13:39:30,810 INFO org.apache.hadoop.ha.NodeFencer: Trying method 1/1: org.apache.hadoop.ha.SshFenceByTcpPort(null)
> 2014-06-19 13:39:30,811 INFO org.apache.hadoop.ha.SshFenceByTcpPort: Connecting to host-10-18-40-101...
> 2014-06-19 13:39:30,811 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: Connecting to host-10-18-40-101 port 22
> 2014-06-19 13:39:33,814 WARN org.apache.hadoop.ha.SshFenceByTcpPort: Unable to connect to host-10-18-40-101 as user myuser
> com.jcraft.jsch.JSchException: java.net.NoRouteToHostException: No route to host
> 	at com.jcraft.jsch.Util.createSocket(Util.java:386)
> 	at com.jcraft.jsch.Session.connect(Session.java:182)
> 	at org.apache.hadoop.ha.SshFenceByTcpPort.tryFence(SshFenceByTcpPort.java:100)
> 	at org.apache.hadoop.ha.NodeFencer.fence(NodeFencer.java:97)
> 	at org.apache.hadoop.ha.ZKFailoverController.doFence(ZKFailoverController.java:521)
> 	at org.apache.hadoop.ha.ZKFailoverController.fenceOldActive(ZKFailoverController.java:494)
> 	at org.apache.hadoop.ha.ZKFailoverController.access$1100(ZKFailoverController.java:59)
> 	at org.apache.hadoop.ha.ZKFailoverController$ElectorCallbacks.fenceOldActive(ZKFailoverController.java:837)
> 	at org.apache.hadoop.ha.ActiveStandbyElector.fenceOldActive(ActiveStandbyElector.java:901)
> 	at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:800)
> 	at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:415)
> 	at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:596)
> 	at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
> 2014-06-19 13:39:33,814 WARN org.apache.hadoop.ha.NodeFencer: Fencing method org.apache.hadoop.ha.SshFenceByTcpPort(null) was unsuccessful.
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
