Date: Thu, 29 Sep 2016 15:06:20 +0000 (UTC)
From: "Daryn Sharp (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Commented] (HDFS-10887) Provide admin/debug tool to dump block map

    [ https://issues.apache.org/jira/browse/HDFS-10887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15533020#comment-15533020 ]

Daryn Sharp commented on HDFS-10887:
------------------------------------

Agree with Kihwal that a tool that lists which nodes in the include list haven't block reported would be far more useful.

Technically, traversing a blocks map containing well over 1/4 billion blocks isn't going to be cheap. Ignoring that...

Operationally, the results of a block scan wouldn't be useful in practice. Finding the blocks with a distributed find (which I hope you aren't suggesting HDFS should support) presumes that nodes with "missing" blocks are online. In most cases the problem is a hardware failure, e.g. a dead host, a bad switch, or a failed volume. You can't find what's not available. Knowing which hosts are dead, are not reporting, or have failed storages lets you immediately know where to focus attention. If the node is down, bring it up. If it's up and hasn't reported, use dfsadmin to force a block report (see the sketch below the quoted issue).

> Provide admin/debug tool to dump block map
> ------------------------------------------
>
>                 Key: HDFS-10887
>                 URL: https://issues.apache.org/jira/browse/HDFS-10887
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: hdfs, namenode
>            Reporter: Yongjun Zhang
>            Assignee: Yongjun Zhang
>         Attachments: HDFS-10887.001.patch, HDFS-10887.002.patch
>
>
> From time to time, when the NN restarts, we see
> {code}
> "The reported blocks X needs additional Y blocks to reach the threshold 0.9990 of total blocks Z. Safe mode will be turned off automatically."
> {code}
> We wonder what these blocks that still need block reports are, which DNs they might be located on, and what happened to those DNs.
> This jira proposes a new admin or debug tool to dump the block map info for blocks that have fewer than minRepl replicas.
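For illustration, a minimal sketch of the workflow described in the comment above, assuming Hadoop 2.7 or later (where dfsadmin supports -triggerBlockReport); the datanode address used below is a placeholder, not from the discussion:

{code}
# See which datanodes the NameNode considers live or dead; dead or
# long-silent nodes are the first place to look for "missing" blocks.
hdfs dfsadmin -report

# If a node is up but has not sent a block report, force a full block
# report from it. dn1.example.com:50020 is a placeholder host; 50020 is
# the default datanode IPC port in Hadoop 2.x.
hdfs dfsadmin -triggerBlockReport dn1.example.com:50020
{code}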