hadoop-hdfs-issues mailing list archives

From "Rushabh S Shah (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-7059) HAadmin transtionToActive with forceActive option can show confusing message.
Date Fri, 12 Sep 2014 21:46:35 GMT

     [ https://issues.apache.org/jira/browse/HDFS-7059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rushabh S Shah updated HDFS-7059:
---------------------------------
    Attachment: HDFS-7059.patch

The info log is emitted from the HAAdmin#isOtherTargetNodeActive method, so the patch
adds a check, before calling that method, for whether the forceActive switch is enabled.
After the change, it is the administrator's responsibility not to transition to active with
the forceActive option enabled when the other namenode is already active.
Accordingly, I removed a test case from TestDFSHAAdminMiniCluster#testTransitionToActiveWhenOtherNamenodeisActive
that checked whether any namenode is active before transitionToActive was called with the
forceActive option enabled.
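
For readers without the attachment open, here is a minimal sketch of the kind of guard described above. It is not the contents of HDFS-7059.patch: the class ForceActiveGuardSketch, the OtherNodeProbe interface, and the shouldTransition method are hypothetical names used only for illustration, and the probe stands in for whatever HAAdmin#isOtherTargetNodeActive does over IPC.

```java
import java.util.ArrayList;
import java.util.List;

public class ForceActiveGuardSketch {

    /** Hypothetical stand-in for the IPC check against the other namenodes. */
    interface OtherNodeProbe {
        boolean isAnyOtherNodeActive();
    }

    /**
     * Returns true if the transition to active should proceed.
     * When forceActive is set, the probe is never consulted, so no
     * connection attempts (and no retry log lines) are triggered by it.
     */
    static boolean shouldTransition(boolean forceActive, OtherNodeProbe probe) {
        if (!forceActive) {
            // Only the non-forced path pays for contacting the other nodes.
            if (probe.isAnyOtherNodeActive()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> probeCalls = new ArrayList<>();
        // Assumption for the demo: the other node always reports active.
        OtherNodeProbe probe = () -> {
            probeCalls.add("probe");
            return true;
        };

        System.out.println("without forceActive: " + shouldTransition(false, probe));
        System.out.println("with forceActive:    " + shouldTransition(true, probe));
        System.out.println("probe calls:         " + probeCalls.size());
    }
}
```

The trade-off is the one stated above: with the forced path skipping the probe, nothing stops an administrator from forcing a second namenode to active.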


> HAadmin transtionToActive with forceActive option can show confusing message.
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-7059
>                 URL: https://issues.apache.org/jira/browse/HDFS-7059
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.0.0-alpha, 3.0.0
>            Reporter: Rushabh S Shah
>            Assignee: Rushabh S Shah
>         Attachments: HDFS-7059.patch
>
>
> Ran into this confusing message on our local HA setup.
> One of the namenodes was down and the other was in standby mode.
> The namenode was not able to come out of safe mode, so we ran transitionToActive with
> the forceActive switch enabled.
> Due to the change in HDFS-2949, the command tries to connect to all the namenodes to see
> whether they are active or not.
> But since the other namenode is down, it tries to connect to that namenode
> 'ipc.client.connect.max.retries' number of times.
> Every time it is not able to connect, it logs a message:
> INFO ipc.Client: Retrying connect to server: <server name>. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
> Since the number of retries in our configuration is 50, it shows this message 50 times.
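
As a rough sense of scale, the retry policy in the quoted log line (maxRetries=50, sleepTime=1000 MILLISECONDS) implies on the order of 50 seconds spent sleeping between attempts, before counting any per-attempt connect timeout. The sketch below only does that arithmetic; RetryWaitEstimate and minSleepMillis are hypothetical names, not part of Hadoop.

```java
public class RetryWaitEstimate {

    /**
     * Lower bound on time spent sleeping between retries for a
     * fixed-sleep policy: one sleep per retry. Per-attempt connect
     * timeouts are deliberately not included.
     */
    static long minSleepMillis(int maxRetries, long sleepTimeMillis) {
        return (long) maxRetries * sleepTimeMillis;
    }

    public static void main(String[] args) {
        // Values taken from the log line quoted in the issue description.
        long millis = minSleepMillis(50, 1000L);
        System.out.println("approximate minimum wait: " + (millis / 1000) + " s");
    }
}
```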



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
