Date: Thu, 27 Mar 2014 02:54:16 +0000 (UTC)
From: "guodongdong (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Updated] (HDFS-6154) Improve the speed of saveNameSpace, making HDFS restart and checkPoint faster

     [ https://issues.apache.org/jira/browse/HDFS-6154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

guodongdong updated HDFS-6154:
------------------------------
    Description:
Saving the namespace on the NameNode has two stages: serializing the INodes, and calculating the MD5 and writing to disk. Currently the two stages run serially. With this improvement, one thread serializes the INodes while another thread calculates the MD5 and writes to disk. In the tests below this speeds up saving the namespace by about 1.6x to 2x:

Test environment:
    test: NameNode saving the namespace only, dfsadmin -saveNamespace
    machine: 144GB, Intel(R) Xeon(R) CPU E5645 @ 2.40GHz, 12 cpu, Raid 5 SAS Disk, jdk 1.7.0

||image size||before optimizing||after optimizing||
|1.2GB|22sec|11sec|
|4.3GB|66sec|36sec|
|22GB|406sec|250sec|

  was:
There are two stage In namenode savenamespace, serializing INode, calculate MD5 and write to disk. Now, two stage is doing serially, In this improvement, it is doing parallel, one thread do serializing INode, other thread do calculating MD5 and writing to disk, it double speed of savenamespace, Detail is show in table:

Testing environment:
    only test namenode savenamespace, dfsadmin -saveNamespace
    machine: 144GB, Intel(R) Xeon(R) CPU E5645 @ 2.40GHz, 12 cpu, Raid 5 SAS Disk, jdk 1.7.0

||image size||before optimizing||after optimizing||
|1.2GB|22sec|11sec|
|4.3GB|66sec|36sec|
|22GB|406sec|250sec|

> Improve the speed of saveNameSpace, making HDFS restart and checkPoint faster
> ----------------------------------------------------------------------------
>
>                 Key: HDFS-6154
>                 URL: https://issues.apache.org/jira/browse/HDFS-6154
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 2.3.0
>            Reporter: guodongdong
>         Attachments: HDFS-6154-new-patch
>
>
> Saving the namespace on the NameNode has two stages: serializing the INodes, and calculating the MD5 and writing to disk. Currently the two stages run serially. With this improvement, one thread serializes the INodes while another thread calculates the MD5 and writes to disk. In the tests below this speeds up saving the namespace by about 1.6x to 2x:
>
> Test environment:
>     test: NameNode saving the namespace only, dfsadmin -saveNamespace
>     machine: 144GB, Intel(R) Xeon(R) CPU E5645 @ 2.40GHz, 12 cpu, Raid 5 SAS Disk, jdk 1.7.0
>
> ||image size||before optimizing||after optimizing||
> |1.2GB|22sec|11sec|
> |4.3GB|66sec|36sec|
> |22GB|406sec|250sec|

--
This message was sent by Atlassian JIRA (v6.2#6252)
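
For readers who want to picture the two-thread split described in the issue, here is a minimal sketch. It is not the attached patch and does not use any HDFS code: the function name `save_pipelined`, the `repr`-based serialization, and the queue size are all assumptions made for illustration. One thread serializes records and hands the bytes to a bounded queue; a second thread updates the MD5 and writes each chunk as it arrives.

```python
import hashlib
import io
import queue
import threading

_DONE = object()  # sentinel telling the consumer that the stream has ended


def save_pipelined(records, out, queue_size=16):
    """Serialize records in one thread; hash and write them in another.

    Hypothetical helper, not part of HDFS. Returns the hex MD5 digest of
    all bytes written to `out`.
    """
    chunks = queue.Queue(maxsize=queue_size)  # bounded, so memory stays capped
    md5 = hashlib.md5()
    errors = []

    def serialize():
        # Stage 1: turn each record into bytes (stand-in for INode serialization).
        try:
            for record in records:
                chunks.put(repr(record).encode("utf-8") + b"\n")
        except Exception as exc:
            errors.append(exc)
        finally:
            chunks.put(_DONE)

    def digest_and_write():
        # Stage 2: update the MD5 and write each chunk to the output.
        failed = False
        while True:
            chunk = chunks.get()
            if chunk is _DONE:
                break
            if failed:
                continue  # keep draining so the producer never blocks
            try:
                md5.update(chunk)
                out.write(chunk)
            except Exception as exc:
                errors.append(exc)
                failed = True

    producer = threading.Thread(target=serialize)
    consumer = threading.Thread(target=digest_and_write)
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    if errors:
        raise errors[0]
    return md5.hexdigest()


if __name__ == "__main__":
    buf = io.BytesIO()
    print(save_pipelined([("inode", i) for i in range(5)], buf))
```

The sketch only shows the structure of the hand-off between the two stages. It makes no claim about speedup, which in the issue comes from measurements on the NameNode itself.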