Mailing-List: contact hdfs-dev-help@hadoop.apache.org; run by ezmlm
Reply-To: hdfs-dev@hadoop.apache.org
Date: Thu, 3 Nov 2011 19:45:32 +0000 (UTC)
From: "Nathan Roberts (Created) (JIRA)"
To: hdfs-dev@hadoop.apache.org
Message-ID: <802123958.57055.1320349532592.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Created] (HDFS-2537) re-replicating under replicated blocks should be more dynamic
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

re-replicating under replicated blocks should be more dynamic
-------------------------------------------------------------

Key: HDFS-2537
URL:
https://issues.apache.org/jira/browse/HDFS-2537
Project: Hadoop HDFS
Issue Type: Improvement
Affects Versions: 0.20.205.0, 0.23.0
Reporter: Nathan Roberts

When a node fails or is decommissioned, a large number of blocks become under-replicated. Since re-replication work is distributed, the hope would be that all blocks could be restored to their desired replication factor in very short order. This doesn't happen, though, because the load the cluster is willing to devote to this activity is mostly static (controlled by configuration variables). Since it's mostly static, the rate has to be set conservatively to avoid overloading the cluster with replication work.

This problem is especially noticeable when you have lots of small blocks. It can take many hours to re-replicate the blocks that were on a node, even while the cluster is mostly idle.
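
To make the arithmetic behind "many hours" concrete, here is a rough back-of-the-envelope sketch. It assumes a simplified model in which the cluster schedules a fixed number of block copies per live node in each scheduling round, regardless of block size or current load. The function name, its parameters, and the numbers in the example are all hypothetical and are not HDFS defaults or actual HDFS settings.

```python
import math


def rereplication_hours(blocks, live_nodes, blocks_per_node_per_round, round_seconds):
    """Estimate hours needed to re-replicate `blocks` under a static cap.

    Simplified, assumed model: every round, each live node is handed at most
    `blocks_per_node_per_round` block copies, and a round lasts
    `round_seconds`. Block size does not appear anywhere in the estimate.
    """
    copies_per_round = live_nodes * blocks_per_node_per_round
    rounds = math.ceil(blocks / copies_per_round)
    return rounds * round_seconds / 3600


# Illustrative numbers only: 1,000,000 under-replicated blocks, 100 live
# nodes, 2 copies per node per round, one round every 3 seconds.
hours = rereplication_hours(1_000_000, 100, 2, 3)
print(f"{hours:.2f} hours")
```

Because the cap in this model counts blocks rather than bytes, a node that held a million tiny blocks takes as long to recover as one that held a million full-size blocks, even though far less data has to move and the cluster may be otherwise idle.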