Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 73FC6200C00 for ; Wed, 4 Jan 2017 02:21:00 +0100 (CET) Received: by cust-asf.ponee.io (Postfix) id 72967160B43; Wed, 4 Jan 2017 01:21:00 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id BC29F160B46 for ; Wed, 4 Jan 2017 02:20:59 +0100 (CET) Received: (qmail 88653 invoked by uid 500); 4 Jan 2017 01:20:58 -0000 Mailing-List: contact hdfs-issues-help@hadoop.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Delivered-To: mailing list hdfs-issues@hadoop.apache.org Received: (qmail 88508 invoked by uid 99); 4 Jan 2017 01:20:58 -0000 Received: from arcas.apache.org (HELO arcas) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 04 Jan 2017 01:20:58 +0000 Received: from arcas.apache.org (localhost [127.0.0.1]) by arcas (Postfix) with ESMTP id 8FED42C2A6A for ; Wed, 4 Jan 2017 01:20:58 +0000 (UTC) Date: Wed, 4 Jan 2017 01:20:58 +0000 (UTC) From: "Gang Xie (JIRA)" To: hdfs-issues@hadoop.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Commented] (HDFS-7784) load fsimage in parallel MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 archived-at: Wed, 04 Jan 2017 01:21:00 -0000 [ https://issues.apache.org/jira/browse/HDFS-7784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15796797#comment-15796797 ] Gang Xie commented on HDFS-7784: -------------------------------- The JVM setting: -Xmx102400m -Xms102400m -Xmn5508m -XX:MaxDirectMemorySize=3686m -XX:MaxPermSize=1024m -XX:+PrintGCApplicationStoppedTime -XX:+UseConcMarkSweepGC -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:SurvivorRatio=6 -XX:+UseCMSCompactAtFullCollection -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly -XX:+CMSParallelRemarkEnabled -XX:+UseNUMA -XX:+CMSClassUnloadingEnabled -XX:CMSMaxAbortablePrecleanTime=10000 -XX:TargetSurvivorRatio=80 -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=100 -XX:GCLogFileSize=128m -XX:CMSWaitDuration=8000 -XX:+CMSScavengeBeforeRemark -XX:ConcGCThreads=16 -XX:ParallelGCThreads=16 -XX:+CMSConcurrentMTEnabled -XX:+SafepointTimeout -XX:MonitorBound=16384 -XX:-UseBiasedLocking -XX:MaxTenuringThreshold=3 -XX:+ParallelRefProcEnabled -XX:-OmitStackTraceInFastThrow > load fsimage in parallel > ------------------------ > > Key: HDFS-7784 > URL: https://issues.apache.org/jira/browse/HDFS-7784 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode > Reporter: Walter Su > Assignee: Walter Su > Priority: Minor > Labels: BB2015-05-TBR > Attachments: HDFS-7784.001.patch, test-20150213.pdf > > > When single Namenode has huge amount of files, without using federation, the startup/restart speed is slow. The fsimage loading step takes the most of the time. fsimage loading can seperate to two parts, deserialization and object construction(mostly map insertion). Deserialization takes the most of CPU time. So we can do deserialization in parallel, and add to hashmap in serial. It will significantly reduce the NN start time. -- This message was sent by Atlassian JIRA (v6.3.4#6332) --------------------------------------------------------------------- To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org