hadoop-hdfs-issues mailing list archives

From "Andrew Wang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6089) Standby NN while transitioning to active throws a connection refused error when the prior active NN process is suspended
Date Wed, 19 Mar 2014 23:09:45 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13941138#comment-13941138 ]

Andrew Wang commented on HDFS-6089:
-----------------------------------

bq. the current SBN auto roller can cause the same issue like "a lot of edit log segments
aren't being cleared out" in case that checkpoints are broken (but the SBN is not down), right?

Yeah, I think so. Checkpointing is pretty stable nowadays though, so I think this is
rare :) Thanks Jing.

> Standby NN while transitioning to active throws a connection refused error when the prior active NN process is suspended
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-6089
>                 URL: https://issues.apache.org/jira/browse/HDFS-6089
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha
>    Affects Versions: 2.4.0
>            Reporter: Arpit Gupta
>            Assignee: Jing Zhao
>         Attachments: HDFS-6089.000.patch, HDFS-6089.001.patch
>
>
> The following scenario was tested:
> * Determine Active NN and suspend the process (kill -19)
> * Wait about 60s to let the standby transition to active
> * Get the service state for nn1 and nn2 and make sure nn2 has transitioned to active.
> What was noticed is that sometimes the call to get the service state of nn2 got a socket
> timeout exception.
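
The scenario above could be scripted roughly as follows. This is only a sketch, not the test that was actually run: it assumes the `hdfs` CLI is on the PATH, that `kill -19` corresponds to SIGSTOP (true on most Linux platforms), and that the service IDs are `nn1` and `nn2` as in the description. The helper names (`suspend_process`, `get_service_state`, `wait_for_active`) are hypothetical.

```python
import os
import signal
import subprocess
import time


def suspend_process(pid):
    """Suspend a process, the equivalent of `kill -19 <pid>` on most Linux systems."""
    os.kill(pid, signal.SIGSTOP)


def get_service_state(service_id, run=subprocess.run, timeout=30):
    """Ask `hdfs haadmin -getServiceState` for a NameNode's HA state.

    Returns the stripped stdout (e.g. "active" or "standby"), or None if the
    command fails or does not finish within `timeout` seconds.
    """
    cmd = ["hdfs", "haadmin", "-getServiceState", service_id]
    try:
        result = run(cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return None
    if result.returncode != 0:
        return None
    return result.stdout.strip()


def wait_for_active(service_id, deadline_s=60.0, poll_s=5.0,
                    get_state=get_service_state, sleep=time.sleep):
    """Poll until `service_id` reports "active" or `deadline_s` seconds elapse.

    Failed or timed-out state queries are treated as "not active yet" and
    retried, so a single transient error does not fail the whole check.
    """
    end = time.monotonic() + deadline_s
    while True:
        if get_state(service_id) == "active":
            return True
        if time.monotonic() >= end:
            return False
        sleep(poll_s)
```

A run of the scenario would then be `suspend_process(active_nn_pid)` followed by `wait_for_active("nn2")`, where `active_nn_pid` is the PID of the active NameNode found beforehand. Polling with retries, rather than a single call after a fixed 60s sleep, is a design choice of this sketch; it makes the check tolerant of the intermittent failure described here, but it would also hide that failure if the goal is to reproduce it.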



