Date: Wed, 15 Jan 2014 19:56:21 +0000 (UTC)
From: "Arpit Agarwal (JIRA)"
To: hdfs-issues@hadoop.apache.org
Reply-To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Updated] (HDFS-5153) Datanode should stagger block reports from individual storages

     [ https://issues.apache.org/jira/browse/HDFS-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arpit Agarwal updated HDFS-5153:
--------------------------------
    Description: 
When the number of blocks on the DataNode grows large we start running into a few issues:
# Block reports take a long time to process on the NameNode. In testing we have seen that a block report with 6 million blocks takes close to one second to process on the NameNode. The NameSystem write lock is held during this time.
# We start hitting the default protobuf message limit of 64MB somewhere around 10 million blocks. While we can increase the message size limit, it already takes over 7 seconds to serialize/deserialize a block report of this size.

HDFS-2832 introduced the concept of a DataNode as a collection of storages, i.e. the NameNode is aware of all the volumes (storage directories) attached to a given DataNode. This makes it easy to split block reports from the DN by sending one report per storage directory to mitigate the above problems.

  was:
When the number of blocks on the DataNode grows large we start running into a few issues:
# Block reports take a long time to process on the NameNode. In testing we have seen that a block report with 6 million blocks takes close to one second to process on the NameNode. The NameSystem write lock is held during this time.
# We start hitting the default protobuf message limit of 64MB somewhere around 10 million blocks. While we can increase the message size limit, it already takes over 7 seconds to serialize/deserialize a block report of this size.

HDFS-2832 introduced the concept of a DataNode as a collection of storages, i.e. the NameNode is aware of all the volumes attached to a given DataNode. This makes it easy to split block reports from the DN by sending one report per attached storage to mitigate the above problems.
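To make the proposal concrete, here is a minimal, self-contained Java sketch of per-storage, staggered block reporting as described above: one report call per storage directory with a random delay between reports. The {{StorageDirectory}}, {{NameNodeClient}}, and {{StaggeredBlockReporter}} names are hypothetical stand-ins for illustration only, not the actual DataNode classes or RPC interfaces.

{code:java}
import java.util.List;
import java.util.Random;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-ins for illustration; not the real HDFS classes.
interface NameNodeClient {
    // One block report RPC covering a single storage directory.
    void blockReport(String storageId, long[] blockIds);
}

class StorageDirectory {
    final String storageId;
    final long[] blockIds;

    StorageDirectory(String storageId, long[] blockIds) {
        this.storageId = storageId;
        this.blockIds = blockIds;
    }
}

public class StaggeredBlockReporter {
    private final Random random = new Random();

    /**
     * Instead of one report covering every block on the DataNode, send one
     * smaller report per storage directory and sleep for a random interval
     * between reports so the NameNode does not have to process them back to
     * back under its namesystem write lock.
     */
    void reportAllStorages(NameNodeClient nameNode,
                           List<StorageDirectory> storages,
                           long maxStaggerMillis) throws InterruptedException {
        for (StorageDirectory storage : storages) {
            nameNode.blockReport(storage.storageId, storage.blockIds);
            // Random delay (jitter) before reporting the next storage.
            TimeUnit.MILLISECONDS.sleep((long) (random.nextDouble() * maxStaggerMillis));
        }
    }
}
{code}

Each per-storage report is only a fraction of a whole-DataNode report, so it stays well clear of the 64MB protobuf message limit, and the NameNode holds its write lock for a shorter stretch per report; the random delay keeps the individual storage reports from arriving back to back.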
> Datanode should stagger block reports from individual storages
> ---------------------------------------------------------------
>
>                 Key: HDFS-5153
>                 URL: https://issues.apache.org/jira/browse/HDFS-5153
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 3.0.0
>            Reporter: Arpit Agarwal
>         Attachments: HDFS-5153.01.patch
>
>
> When the number of blocks on the DataNode grows large we start running into a few issues:
> # Block reports take a long time to process on the NameNode. In testing we have seen that a block report with 6 million blocks takes close to one second to process on the NameNode. The NameSystem write lock is held during this time.
> # We start hitting the default protobuf message limit of 64MB somewhere around 10 million blocks. While we can increase the message size limit, it already takes over 7 seconds to serialize/deserialize a block report of this size.
> HDFS-2832 introduced the concept of a DataNode as a collection of storages, i.e. the NameNode is aware of all the volumes (storage directories) attached to a given DataNode. This makes it easy to split block reports from the DN by sending one report per storage directory to mitigate the above problems.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)