hadoop-hdfs-issues mailing list archives

From "Yongjun Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6706) ZKFailoverController failed to recognize the quorum is not met
Date Mon, 21 Jul 2014 20:30:41 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14069226#comment-14069226 ]

Yongjun Zhang commented on HDFS-6706:
-------------------------------------

Hi [~brandonli],

Thanks for reporting and addressing the issue. I have some questions. The original report
seems to say that the logged error message does not reflect the real cause of the failure.
My questions are:
1. In the case reported initially, the real problem was said to be "The zkfc cannot be startup
due to ha.zookeeper.quorum is not met". With your last update, can we say the real problem
is a misconfiguration?
2. What kind of misconfiguration caused the symptom?
3. When the cluster is misconfigured, the user will still see the reported error message.
Should the error message state that the symptom may be caused by a misconfiguration?
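
To make question 3 concrete, here is a minimal sketch of what a hinted error message could look like. It is written in Python for brevity and is not ZKFC code: the function name `add_hint`, the `HINTS` table, and the hint wording are all hypothetical; only the matched log fragments come from the report.

```python
# Hypothetical illustration only: maps log fragments quoted in the report
# to a possible-cause hint. Not part of Hadoop or ZKFC.
HINTS = {
    "Parent znode does not exist":
        "check ha.zookeeper.quorum and whether 'hdfs zkfc -formatZK' succeeded",
    "code:NONODE":
        "the parent znode may be missing; check the ZooKeeper configuration",
}


def add_hint(log_line):
    """Return the log line with a possible-cause hint appended, if one matches."""
    for marker, hint in HINTS.items():
        if marker in log_line:
            return f"{log_line} (possible cause: {hint})"
    # No known marker: leave the message unchanged.
    return log_line


print(add_hint("Unable to start failover controller. Parent znode does not exist."))
```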

Thanks.


> ZKFailoverController failed to recognize the quorum is not met
> --------------------------------------------------------------
>
>                 Key: HDFS-6706
>                 URL: https://issues.apache.org/jira/browse/HDFS-6706
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Brandon Li
>            Assignee: Brandon Li
>
> Thanks Kenny Zhang for finding this problem.
> The zkfc cannot be startup due to ha.zookeeper.quorum is not met. "zkfc -format" doesn't
> log the real problem. And then user will see the error message instead of the real issue when
> starting zkfc:
> 2014-07-01 17:08:17,528 FATAL ha.ZKFailoverController (ZKFailoverController.java:doRun(213)) - Unable to start failover controller. Parent znode does not exist.
> Run with -formatZK flag to initialize ZooKeeper.
> 2014-07-01 16:00:48,678 FATAL ha.ZKFailoverController (ZKFailoverController.java:fatalError(365)) - Fatal error occurred:Received create error from Zookeeper. code:NONODE for path /hadoop-ha/prodcluster/ActiveStandbyElectorLock
> 2014-07-01 17:24:44,202 - INFO ProcessThread(sid:2 cport:-1)::PrepRequestProcessor@627 - Got user-level KeeperException when processing sessionid:0x346f36191250005 type:create cxid:0x2 zxid:0xf00000033 txntype:-1 reqpath:n/a Error Path:/hadoop-ha/prodcluster/ActiveStandbyElectorLock Error:KeeperErrorCode = NodeExists for /hadoop-ha/prodcluster/ActiveStandbyElectorLock
> To reproduce the problem:
> 1. use an HDFS cluster with automatic HA enabled and set ha.zookeeper.quorum to list 3 ZooKeeper servers.
> 2. start two zookeeper servers.
> 3. do "hdfs zkfc -format", and then "hdfs zkfc"
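
The reproduction steps above appear to describe three configured ZooKeeper servers with only two running. The sketch below, in Python, shows one way to reason about that state. It assumes that ha.zookeeper.quorum is a comma-separated list of host:port entries, that the length of that list equals the ensemble size (the real ensemble is defined in ZooKeeper's own configuration), and that a ZooKeeper ensemble needs a strict majority of its servers. The hostnames and function names are hypothetical.

```python
def parse_quorum(value):
    """Split a comma-separated host:port list into (host, port) tuples."""
    servers = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, _, port = entry.rpartition(":")
        if not host:
            # No port given; fall back to ZooKeeper's default client port.
            host, port = entry, "2181"
        servers.append((host, int(port)))
    return servers


def majority_needed(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def has_majority(ensemble_size, running):
    """True if the running servers form a majority of the ensemble."""
    return running >= majority_needed(ensemble_size)


# Hypothetical hostnames, standing in for the three configured servers.
servers = parse_quorum("zk1:2181,zk2:2181,zk3:2181")
print(len(servers), majority_needed(len(servers)), has_majority(len(servers), 2))
```

Under these assumptions, two running servers out of three still form a majority, which may be relevant to whether the symptom comes from an unmet quorum or from a misconfiguration.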



--
This message was sent by Atlassian JIRA
(v6.2#6252)
