From: "Todd Lipcon (Commented) (JIRA)"
To: common-issues@hadoop.apache.org
Reply-To: common-issues@hadoop.apache.org
Date: Thu, 9 Feb 2012 05:25:59 +0000 (UTC)
Subject: [jira] [Commented] (HADOOP-8041) HA: log a warning when a failover is first attempted
Message-ID: <88552533.18160.1328765159438.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1770088487.17728.1328753279374.JavaMail.tomcat@hel.zones.apache.org>

[
https://issues.apache.org/jira/browse/HADOOP-8041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204274#comment-13204274 ]

Todd Lipcon commented on HADOOP-8041:
-------------------------------------

My thinking is the following:

For a given proxy object, when we first have a successful RPC, we can set an internal flag in the failover proxy provider indicating that it has connected once. Then, whenever we do a failover, if that flag is set, we should print a warning. Otherwise, only print it at DEBUG level.

This would allow FsShell commands to not print warnings due to an "already been failed over for a while" situation, but still cause an INFO msg to be printed in MR tasks or HBase servers if a failover takes place while they're accessing DFS.

> HA: log a warning when a failover is first attempted
> -----------------------------------------------------
>
>                 Key: HADOOP-8041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8041
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: ha
>    Affects Versions: HA Branch (HDFS-1623)
>            Reporter: Eli Collins
>
> Currently we always warn for each client operation made to a NN we've failed over to:
> {noformat}
> hadoop-0.24.0-SNAPSHOT $ ./bin/hdfs dfs -lsr /
> 12/02/08 17:43:04 WARN retry.RetryInvocationHandler: Exception while invoking getFileInfo of class org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB after 0 fail over attempts. Trying to fail over immediately.
> {noformat}
> I'm going to remove this warning in HDFS-2918 since we shouldn't warn every time a client performs an operation, e.g. it could be weeks after the failover. But we should log a warning when the client first does a failover.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
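
The flag-based approach described in the comment above could be sketched roughly as follows. This is a minimal illustration, not Hadoop code: the class `FailoverLogPolicy`, its `Level` enum, and both method names are hypothetical, and a real change would live inside the failover proxy provider and use the project's own logging framework. The comment mentions both a warning and an INFO message; the sketch uses WARN for a proxy that has already connected once.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/**
 * Hypothetical helper (not part of Hadoop) that decides how loudly a
 * failover should be logged for a single proxy object.
 */
class FailoverLogPolicy {
    /** Levels used in this sketch only; real code would use the logger's own levels. */
    public enum Level { DEBUG, WARN }

    // Becomes true once this proxy has completed at least one successful RPC.
    private final AtomicBoolean connectedOnce = new AtomicBoolean(false);

    /** Call after any RPC made through this proxy returns successfully. */
    public void recordSuccessfulCall() {
        connectedOnce.set(true);
    }

    /** Level at which a failover happening right now should be logged. */
    public Level levelForFailover() {
        return connectedOnce.get() ? Level.WARN : Level.DEBUG;
    }

    public static void main(String[] args) {
        // Short-lived client whose very first call hits the wrong NN:
        // it has never connected, so the failover is logged quietly.
        FailoverLogPolicy shell = new FailoverLogPolicy();
        System.out.println("shell: " + shell.levelForFailover()); // prints "shell: DEBUG"

        // Long-running client that was already talking to a NN when the
        // failover happened: the event is logged loudly.
        FailoverLogPolicy task = new FailoverLogPolicy();
        task.recordSuccessfulCall();
        System.out.println("task: " + task.levelForFailover());   // prints "task: WARN"
    }
}
```

The flag is never reset in this sketch, matching the "has connected once" wording; whether it should be cleared after a failover completes is left open.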