Date: Tue, 19 Jun 2012 04:06:43 +0000 (UTC)
From: "Vinay (JIRA)"
To: hdfs-issues@hadoop.apache.org
Reply-To: hdfs-issues@hadoop.apache.org
Message-ID: <395913525.28327.1340078804004.JavaMail.jiratomcat@issues-vm>
In-Reply-To: <4007362.39761288982082841.JavaMail.jira@thor>
Subject: [jira] [Commented] (HDFS-1490) TransferFSImage should timeout

    [ https://issues.apache.org/jira/browse/HDFS-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396494#comment-13396494 ]

Vinay commented on HDFS-1490:
-----------------------------

Hi Molkov,
We also faced the same problem in 2.0.1. Are you planning to post a patch for this?

> TransferFSImage should timeout
> ------------------------------
>
>                 Key: HDFS-1490
>                 URL: https://issues.apache.org/jira/browse/HDFS-1490
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>            Reporter: Dmytro Molkov
>            Assignee: Dmytro Molkov
>            Priority: Minor
>
> Sometimes, when the primary crashes during an image transfer, the secondary namenode hangs forever trying to read the image from the HTTP connection.
> It would be great to set timeouts on the connection so that, if this happens, there is no need to restart the secondary itself.
> In our case, restarting components is handled by a set of scripts, and since the Secondary process is still running, it just stays hung until we get an alarm saying that checkpointing is not happening.
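
For illustration only, here is a minimal sketch of the kind of timeouts the issue description asks for, written against the standard java.net.HttpURLConnection API. It is not the HDFS patch and does not use any Hadoop code: the class name, the helper names, the timeout values, and the "/getimage" path are all assumptions made for this example. The second helper simulates a crashed primary with a local socket that accepts connections but never answers, so a read with a timeout fails with SocketTimeoutException instead of hanging.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.Proxy;
import java.net.ServerSocket;
import java.net.SocketTimeoutException;
import java.net.URL;

// Hypothetical sketch class; not part of Hadoop.
class ImageTransferTimeoutSketch {

    // Hypothetical helper: prepares an HTTP connection with explicit
    // connect and read timeouts. Nothing is sent until the caller
    // connects or asks for the response stream.
    static HttpURLConnection openWithTimeouts(URL url, int connectTimeoutMs, int readTimeoutMs)
            throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection(Proxy.NO_PROXY);
        conn.setConnectTimeout(connectTimeoutMs); // limit on establishing the TCP connection
        conn.setReadTimeout(readTimeoutMs);       // limit on each blocking read of the response
        return conn;
    }

    // Simulates a peer that went silent mid-transfer: the listening socket
    // lets the client connect but never sends a response.
    // Returns true if the read gave up with a SocketTimeoutException.
    static boolean stalledReadTimesOut(int readTimeoutMs) throws IOException {
        try (ServerSocket silentServer = new ServerSocket(0)) {
            // "/getimage" is a placeholder path for this sketch.
            URL url = new URL("http://127.0.0.1:" + silentServer.getLocalPort() + "/getimage");
            HttpURLConnection conn = openWithTimeouts(url, 1000, readTimeoutMs);
            try (InputStream in = conn.getInputStream()) {
                in.read();
                return false; // data arrived; not expected with a silent peer
            } catch (SocketTimeoutException e) {
                return true;  // the read was abandoned instead of blocking forever
            } finally {
                conn.disconnect();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("stalled read timed out: " + stalledReadTimesOut(500));
    }
}
```

With a timeout in place, the caller gets an exception it can handle (for example, by failing the checkpoint attempt and retrying later), so the process no longer has to be restarted by hand.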