Date: Thu, 2 Jan 2014 21:00:50 +0000 (UTC)
From: "Rohan Pasalkar (JIRA)"
To: hdfs-issues@hadoop.apache.org
Reply-To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Updated] (HDFS-5711) Removing memory limitation of the Namenode by persisting Block - Block location mappings to disk.


     [ https://issues.apache.org/jira/browse/HDFS-5711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohan Pasalkar updated HDFS-5711:
---------------------------------

    Description:
This jira tracks the changes needed to remove the HDFS Name-node's memory limitation on holding block - block location mappings.

It is a known fact that the single Name-node architecture of HDFS has scalability limits. The HDFS federation project alleviates this problem by using horizontal scaling.
This helps increase the throughput of metadata operations and also the amount of data that can be stored in a Hadoop cluster.
The Name-node stores all the filesystem metadata in memory (even in the federated architecture). The Name-node design can be enhanced by persisting part of the metadata onto secondary storage and retaining the popular or recently accessed metadata in main memory. This design can benefit an HDFS deployment which doesn't use federation but needs to store a large number of files or a large number of blocks. Lin Xiao from Hortonworks attempted a similar project [1] in the summer of 2013. They used LevelDB to persist the Namespace information (i.e., file and directory inode information). A patch with this change is yet to be submitted to the code base. We also intend to use LevelDB to persist metadata, and plan to provide a complete solution by persisting not just the Namespace information but also the Blocks Map onto secondary storage.
We have implemented a basic prototype which stores the block - block location mapping metadata in a persistent key-value store, i.e., LevelDB. The prototype also maintains an in-memory cache of the recently used block - block location mappings.

References:
[1] Lin Xiao, Hortonworks, Removing Name-node’s memory limitation, http://www.slideshare.net/ydn/hadoop-meetup-hug-august-2013-removing-the-namenodes-memory-limitation

  was:
This jira acts as an umbrella jira to track all the improvements we've done recently to improve Namenode's performance, responsiveness, and hence scalability. Those improvements include:
1. Incremental block reports (HDFS-395)
2. BlockManager.reportDiff optimization for processing block reports (HDFS-2477)
3. Upgradable lock to allow simultaneous read operations while reportDiff is in progress in processing block reports (HDFS-2490)
4. More CPU efficient data structure for under-replicated/over-replicated/invalidate blocks (HDFS-2476)
5. Increase granularity of write operations in ReplicationMonitor, thus reducing contention for the write lock (HDFS-2495)
6. Support variable block sizes
7. Release RPC handlers while waiting for the edit log to be synced to disk
8. Reduce network traffic pressure on the master rack where the NN is located by lowering the read priority of the replicas on that rack
9. A standalone KeepAlive heartbeat thread
10. Reduce multiple traversals of the path directory to one for most namespace manipulations
11. Move logging out of the write lock section.


> Removing memory limitation of the Namenode by persisting Block - Block location mappings to disk.
> -------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-5711
>                 URL: https://issues.apache.org/jira/browse/HDFS-5711
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Rohan Pasalkar
>
> This jira tracks the changes needed to remove the HDFS Name-node's memory limitation on holding block - block location mappings.
> It is a known fact that the single Name-node architecture of HDFS has scalability limits. The HDFS federation project alleviates this problem by using horizontal scaling. This helps increase the throughput of metadata operations and also the amount of data that can be stored in a Hadoop cluster.
> The Name-node stores all the filesystem metadata in memory (even in the federated architecture). The Name-node design can be enhanced by persisting part of the metadata onto secondary storage and retaining the popular or recently accessed metadata in main memory. This design can benefit an HDFS deployment which doesn't use federation but needs to store a large number of files or a large number of blocks. Lin Xiao from Hortonworks attempted a similar project [1] in the summer of 2013.
> They used LevelDB to persist the Namespace information (i.e., file and directory inode information). A patch with this change is yet to be submitted to the code base. We also intend to use LevelDB to persist metadata, and plan to provide a complete solution by persisting not just the Namespace information but also the Blocks Map onto secondary storage.
> We have implemented a basic prototype which stores the block - block location mapping metadata in a persistent key-value store, i.e., LevelDB. The prototype also maintains an in-memory cache of the recently used block - block location mappings.
>
> References:
> [1] Lin Xiao, Hortonworks, Removing Name-node’s memory limitation, http://www.slideshare.net/ydn/hadoop-meetup-hug-august-2013-removing-the-namenodes-memory-limitation



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
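
The design described in the issue (a persistent key-value store holding every block - block location mapping, fronted by an in-memory cache of the recently used ones) might look roughly like the sketch below. It is an illustration under stated assumptions, not the HDFS-5711 prototype: the class name `CachedBlockMap`, its methods, the write-through policy, and the least-recently-used eviction rule are all hypothetical, and a plain Python dict stands in for LevelDB.

```python
from collections import OrderedDict


class CachedBlockMap:
    """Block ID -> datanode locations, with a bounded in-memory cache.

    Hypothetical sketch: `store` is any dict-like object and stands in
    for a persistent key-value store such as LevelDB.
    """

    def __init__(self, store, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._store = store          # holds every mapping ("on disk")
        self._capacity = capacity    # max mappings kept in memory
        self._cache = OrderedDict()  # least recently used entry first
        self.hits = 0
        self.misses = 0

    def _remember(self, block_id, locations):
        # Mark the entry as most recently used, then evict the oldest
        # entries until the cache fits within its capacity.
        self._cache[block_id] = locations
        self._cache.move_to_end(block_id)
        while len(self._cache) > self._capacity:
            self._cache.popitem(last=False)

    def put(self, block_id, locations):
        # Write-through (an assumption): the store is always updated,
        # so evicting from memory never loses a mapping.
        locations = tuple(locations)
        self._store[block_id] = locations
        self._remember(block_id, locations)

    def get(self, block_id):
        if block_id in self._cache:
            self.hits += 1
            self._cache.move_to_end(block_id)
            return self._cache[block_id]
        # Cache miss: fall back to the store and cache the result.
        self.misses += 1
        locations = self._store.get(block_id)
        if locations is not None:
            self._remember(block_id, locations)
        return locations

    def cached_ids(self):
        # Block IDs currently in memory, least recently used first.
        return list(self._cache)


# Usage: three mappings with room for only two in memory.
store = {}
blocks = CachedBlockMap(store, capacity=2)
blocks.put(1001, ["dn1", "dn2"])
blocks.put(1002, ["dn2", "dn3"])
blocks.put(1003, ["dn1", "dn3"])  # 1001 leaves memory but stays in the store
```

With this layout, memory use is bounded by `capacity` rather than by the total number of blocks, at the cost of a store lookup whenever a requested mapping is not cached.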