hadoop-hdfs-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6507) Improve DFSAdmin to support HA cluster better
Date Tue, 17 Jun 2014 16:13:06 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033963#comment-14033963 ]

Hadoop QA commented on HDFS-6507:

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 2 new or modified
test files.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of
javac compiler warnings.

    {color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

    {color:green}+1 eclipse:eclipse{color}.  The patch built with eclipse:eclipse.

    {color:green}+1 findbugs{color}.  The patch does not introduce any new Findbugs (version
1.3.9) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number
of release audit warnings.

    {color:red}-1 core tests{color}.  The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:


    {color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7147//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7147//console

This message is automatically generated.

> Improve DFSAdmin to support HA cluster better
> ---------------------------------------------
>                 Key: HDFS-6507
>                 URL: https://issues.apache.org/jira/browse/HDFS-6507
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: tools
>    Affects Versions: 2.4.0
>            Reporter: Zesheng Wu
>            Assignee: Zesheng Wu
>         Attachments: HDFS-6507.1.patch, HDFS-6507.2.patch, HDFS-6507.3.patch, HDFS-6507.4-inprogress.patch,
HDFS-6507.4.patch, HDFS-6507.5.patch, HDFS-6507.6.patch
> Currently, the commands supported in DFSAdmin can be classified into three categories
according to the protocol used:
> 1. ClientProtocol
> Commands in this category are generally implemented by calling the corresponding function of
the DFSClient class, which in turn calls the corresponding remote implementation on the
NN side. On the NN side, all these operations are classified into five categories:
UNCHECKED, READ, WRITE, CHECKPOINT, JOURNAL. The Active NN allows all operations, and the Standby
NN allows only UNCHECKED operations. In the current implementation, DFSClient connects to
one NN first; if that NN is not Active and the operation is not allowed, it fails over
to the second NN. Here is the problem: some of the commands (setSafeMode, saveNameSpace,
restoreFailedStorage, refreshNodes, setBalancerBandwidth, metaSave) in DFSAdmin are classified
as UNCHECKED operations, so when they are run from the DFSAdmin command line they
are sent to one fixed NN, whether it is Active or Standby. This may result in two problems:

> a. If the first NN tried is Standby, the operation takes effect only on the Standby NN,
which is not the expected result.
> b. If the operation needs to take effect on both NNs, it takes effect on only one.
After a later NN failover, this may cause problems.
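
The routing behaviour described above can be sketched in a few lines of Java. This is a toy model under stated assumptions, not Hadoop code: HaRoutingSketch, isAllowed and servingNameNode are hypothetical names, and the only rules modelled are the ones stated in the description (Active allows everything, Standby allows only UNCHECKED, the client tries the NNs in order).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model, not Hadoop's real classes: it only mirrors the rules described above.
public class HaRoutingSketch {
    enum OperationCategory { UNCHECKED, READ, WRITE, CHECKPOINT, JOURNAL }
    enum HaState { ACTIVE, STANDBY }

    /** An Active NN admits every category; a Standby NN admits only UNCHECKED. */
    static boolean isAllowed(HaState state, OperationCategory op) {
        return state == HaState.ACTIVE || op == OperationCategory.UNCHECKED;
    }

    /**
     * Tries the NNs in order and returns the index of the first one that
     * accepts the operation, or -1 if none does.
     */
    static int servingNameNode(List<HaState> nameNodes, OperationCategory op) {
        for (int i = 0; i < nameNodes.size(); i++) {
            if (isAllowed(nameNodes.get(i), op)) {
                return i;
            }
            // Rejected by a Standby NN: the client fails over to the next NN.
        }
        return -1;
    }

    public static void main(String[] args) {
        // The first configured NN happens to be the Standby.
        List<HaState> nns = Arrays.asList(HaState.STANDBY, HaState.ACTIVE);
        System.out.println("WRITE served by NN index "
                + servingNameNode(nns, OperationCategory.WRITE));
        System.out.println("UNCHECKED served by NN index "
                + servingNameNode(nns, OperationCategory.UNCHECKED));
    }
}
```

In this model a WRITE operation is rejected by the Standby and reaches the Active NN, while an UNCHECKED command is accepted by whichever NN is tried first, which is problem (a).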
> Here I propose the following improvements:
> a. If the command can be classified as one of the READ/WRITE/CHECKPOINT/JOURNAL operations,
we should classify it explicitly.
> b. If the command cannot be classified as one of the above four operations, or if the
command needs to take effect on both NNs, we should send the request to both the Active and Standby NNs.
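
One possible shape for "send the request to both NNs" is sketched below in Java. It is only an illustration, not the attached patch: NameNodeAdmin and runOnAll are hypothetical names, the proxy is a stand-in for a real RPC proxy, and the addresses are made up. The sketch assumes the command should still be attempted on the remaining NNs when one of them fails.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: NameNodeAdmin and runOnAll are illustrative names, not Hadoop APIs.
public class BroadcastAdminSketch {
    /** Stand-in for an RPC proxy to a single NameNode. */
    interface NameNodeAdmin {
        void refreshNodes() throws Exception;
    }

    /**
     * Sends the command to every NameNode and keeps going after a failure,
     * so one unreachable NN does not prevent the others from being updated.
     * Returns the addresses of the NNs on which the command failed.
     */
    static List<String> runOnAll(Map<String, NameNodeAdmin> proxies) {
        List<String> failed = new ArrayList<>();
        for (Map.Entry<String, NameNodeAdmin> e : proxies.entrySet()) {
            try {
                e.getValue().refreshNodes();
                System.out.println("Refresh nodes successful for " + e.getKey());
            } catch (Exception ex) {
                System.out.println("Refresh nodes failed for " + e.getKey()
                        + ": " + ex.getMessage());
                failed.add(e.getKey());
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Made-up addresses; the second proxy simulates an unreachable NN.
        Map<String, NameNodeAdmin> proxies = new LinkedHashMap<>();
        proxies.put("nn1.example.com:8020", () -> { });
        proxies.put("nn2.example.com:8020", () -> {
            throw new java.io.IOException("connection refused");
        });
        List<String> failed = runOnAll(proxies);
        System.out.println("Failed on: " + failed);
    }
}
```

Reporting the result per NN lets the operator see which NN still needs the command re-run, instead of a single success or failure for the whole cluster.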
> 2. Refresh protocols: RefreshAuthorizationPolicyProtocol, RefreshUserMappingsProtocol,
RefreshCallQueueProtocol
> Commands in this category, including refreshServiceAcl, refreshUserToGroupMapping, refreshSuperUserGroupsConfiguration
and refreshCallQueue, are implemented by creating the corresponding RPC proxy and sending the
request to the remote NN. In the current implementation, these requests are sent to one fixed
NN, whether it is Active or Standby. Here I propose that we send these requests to both NNs.
> 3. ClientDatanodeProtocol
> Commands in this category are handled correctly; no improvement is needed.

This message was sent by Atlassian JIRA
