From: Harsh J
Date: Thu, 26 Mar 2015 02:49:38 +0530
Subject: Re: Can block size for namenode be different from wdatanode block size?
Reply-To: user@hadoop.apache.org

> 2. The block size is only relevant to DataNodes (DN). NameNode (NN) does
> not use this parameter

Actually, as a configuration, it's only relevant to the client. See also
http://www.quora.com/How-do-I-check-HDFS-blocksize-default-custom

Other points sound about right, except that (7) can now only be done if you
have the legacy mode of fsimage writes enabled. The new OIV tool in recent
releases only serves a REST-based web server for querying the file data.

On Thu, Mar 26, 2015 at 1:47 AM, Mich Talebzadeh wrote:

> Thank you all for your contribution.
>
> I have summarised the findings as below:
>
> 1. The Hadoop block size is a configurable parameter, dfs.block.size, in
>    bytes. By default this is set to 134217728 bytes, or 128MB.
>
> 2. The block size is only relevant to DataNodes (DN). NameNode (NN) does
>    not use this parameter.
>
> 3. NN behaves like an in-memory database (IMDB) and uses a file on disk
>    called the FsImage to load the metadata at startup. This is the only
>    place where I see value in Solid State Disk, to make this initial load
>    faster.
>
> 4. For the remaining period, until HDFS shutdown or otherwise, NN will use
>    the in-memory cache to access metadata.
>
> 5. With regard to sizing of NN to store metadata, one can use the
>    following rules of thumb (heuristics):
>
>    a. NN consumes roughly 1GB for every 1 million blocks (source: Hadoop
>       Operations, Eric Sammer, ISBN: 978-1-499-3205-7). So if you have a
>       128MB block size, you can store 128 * 1E6 / (3 * 1024) = 41,666GB of
>       data for every 1GB. The number 3 comes from the fact that each block
>       is replicated three times. In other words, just under 42TB of data.
>       So if you have 10GB of namenode cache, you can have up to 420TB of
>       data on your datanodes.
>
> 6. You can take the FsImage file from Hadoop and convert it into a text
>    file as follows:
>
>    hdfs dfsadmin -fetchImage nnimage
>
> 15/03/25 20:17:40 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> 15/03/25 20:17:41 INFO namenode.TransferFsImage: Opening connection to http://rhes564:50070/imagetransfer?getimage=1&txid=latest
> 15/03/25 20:17:41 INFO namenode.TransferFsImage: Image Transfer timeout configured to 60000 milliseconds
> 15/03/25 20:17:41 WARN namenode.TransferFsImage: Overwriting existing file nnimage with file downloaded from http://rhes564:50070/imagetransfer?getimage=1&txid=latest
> 15/03/25 20:17:41 INFO namenode.TransferFsImage: Transfer took 0.03s at 1393.94 KB/s
>
> 7. That creates an image file in the current directory that can be
>    converted to a text file:
>
>    hdfs oiv -i nnimage -o nnimage.txt
>
> 15/03/25 20:20:07 INFO offlineImageViewer.FSImageHandler: Loading 2 strings
> 15/03/25 20:20:07 INFO offlineImageViewer.FSImageHandler: Loading 543 inodes.
> 15/03/25 20:20:07 INFO offlineImageViewer.FSImageHandler: Loading inode references
> 15/03/25 20:20:07 INFO offlineImageViewer.FSImageHandler: Loaded 0 inode references
> 15/03/25 20:20:07 INFO offlineImageViewer.FSImageHandler: Loading inode directory section
> 15/03/25 20:20:07 INFO offlineImageViewer.FSImageHandler: Loaded 198 directories
> 15/03/25 20:20:07 INFO offlineImageViewer.WebImageViewer: WebImageViewer started. Listening on /127.0.0.1:5978. Press Ctrl+C to stop the viewer.
>
> Let me know if I missed anything or got it wrong.
>
> HTH
>
> Mich Talebzadeh
>
> http://talebzadehmich.wordpress.com

--
Harsh J
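For anyone who wants to redo the arithmetic in point 5a with other numbers, here is a minimal sketch. It simply mirrors the formula quoted above (1GB of NameNode heap per 1 million blocks, divided by the replication factor); the function name and its defaults are my own illustrative assumptions, not anything from Hadoop itself, and the result is only as good as the rule of thumb.

```python
def addressable_data_gb(heap_gb, block_size_mb=128, replication=3,
                        blocks_per_gb_heap=1_000_000):
    """Estimate user data (in GB) per the rule of thumb quoted in point 5a.

    Mirrors the quoted formula: block_size_mb * blocks / (replication * 1024).
    All names and defaults here are illustrative assumptions.
    """
    total_blocks = heap_gb * blocks_per_gb_heap   # blocks the heap can track
    raw_mb = total_blocks * block_size_mb         # size of all those blocks, in MB
    return raw_mb / (replication * 1024)          # drop replicas, convert MB to GB


one_gb_heap = addressable_data_gb(1)    # about 41,666 GB, i.e. just under 42TB
ten_gb_heap = addressable_data_gb(10)   # ten times that, roughly 417,000 GB
```

Doubling the block size doubles the estimate, which is why the heuristic is so sensitive to how many small files (and therefore small blocks) the cluster holds.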
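To illustrate the point that block size is a client-side setting, here is a rough sketch that builds the command lines a client could use to write a file with its own block size and then check what was recorded. It assumes the current property name dfs.blocksize (dfs.block.size is the older, deprecated name) and the generic -D option of the hdfs dfs shell; the helper names and paths are hypothetical, and nothing is executed here.

```python
def put_with_block_size(local_path, hdfs_path, block_size_bytes=128 * 1024 * 1024):
    """Build an `hdfs dfs -put` command that sets dfs.blocksize for this write only.

    Hypothetical helper: it only returns the argument list, it does not run it.
    """
    if block_size_bytes <= 0:
        raise ValueError("block size must be a positive number of bytes")
    return ["hdfs", "dfs", "-D", f"dfs.blocksize={block_size_bytes}",
            "-put", local_path, hdfs_path]


def stat_block_size(hdfs_path):
    """Build an `hdfs dfs -stat` command; the %o format prints the file's block size."""
    return ["hdfs", "dfs", "-stat", "%o", hdfs_path]


# Example: request 256MB blocks for a single upload (paths are made up).
put_cmd = put_with_block_size("data.csv", "/tmp/data.csv", 256 * 1024 * 1024)
stat_cmd = stat_block_size("/tmp/data.csv")
```

On a machine with the HDFS client installed, each list can be handed to subprocess.run(cmd, check=True). The cluster may still reject values that violate its own limits, so treat the 256MB figure as an example only.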