From: Scott Carey <scott@richrelevance.com>
To: user@hadoop.apache.org
Subject: Re: Optimizing Disk I/O - does HDFS do anything ?
Date: Sat, 17 Nov 2012 07:27:49 +0000
Ext3 can be quite atrocious when it comes to fragmentation. Simply start with an empty drive and have 8 threads each concurrently write to their own large file sequentially; the resulting files will be heavily fragmented.
ext4 is much better in this regard.
xfs is not as good at initial placement, but has an online defragmenter.
ext4 is fastest on a clean system but eventually can get somewhat fragmented and has no defragmentation option.
xfs is slow at metadata operations and I would avoid it for M/R temp for that reason.


I use ext4 for M/R temp, and xfs + online defragmenter for HDFS. The defragmenter runs nightly and has little work to do if run regularly.
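Replicating that layout is just a matter of pointing the M/R temp and DataNode directories at the right mounts. A minimal sketch for Hadoop 1.x configuration, with made-up mount paths:

    <!-- mapred-site.xml: M/R temp on the ext4 mount -->
    <property>
      <name>mapred.local.dir</name>
      <value>/mnt/ext4-scratch/mapred/local</value>
    </property>

    <!-- hdfs-site.xml: DataNode block storage on the xfs mounts -->
    <property>
      <name>dfs.data.dir</name>
      <value>/mnt/xfs-disk1/hdfs/data,/mnt/xfs-disk2/hdfs/data</value>
    </property>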



On 11/13/12 1:10 PM, "Bertrand Dechoux" <dechouxb@gmail.com> wrote:

People are welcome to complement this, but I guess the answer is:
1) Hadoop does not run on Windows. (I am not sure whether Microsoft has made any statement about the OS used for Hadoop on Azure.)
-> http://www.howtogeek.com/115229/htg-explains-why-linux-doesnt-need-defragmenting/
2) Files are written in one go with big blocks. (And actually, file fragmentation is not the only issue. The many-small-files 'issue' is, in the end, a data fragmentation issue too and has an impact on read throughput.)
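To make point 2 concrete: the client chooses the block size when it creates a file and then streams the whole file sequentially, so each block reaches a datanode as one long write. A minimal sketch against the Hadoop FileSystem API (the path, replication factor, and sizes are arbitrary examples):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class BigBlockWrite {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Ask for 128 MB blocks for this file; the stream below is
        // append-only, so each block lands as one sequential write.
        long blockSize = 128L * 1024 * 1024;
        FSDataOutputStream out = fs.create(
            new Path("/tmp/bigfile.dat"),   // hypothetical path
            true,                           // overwrite
            conf.getInt("io.file.buffer.size", 4096),
            (short) 3,                      // replication
            blockSize);
        try {
          byte[] buf = new byte[8192];
          for (int i = 0; i < 1024; i++) {
            out.write(buf);                 // 8 MB of sequential writes
          }
        } finally {
          out.close();
        }
      }
    }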
Bertrand Dechoux

On Tue, Nov 13, 2012 at 9:30 PM, Jay Vyas <jayunit100@gmail.com> wrote:
How does HDFS deal with optimization of file streaming? Do data nodes have any optimizations at the disk level for dealing with fragmented files? I assume not, but just curious if this is at all in the works, or if there are java-y ways of dealing with a long-running set of files in an HDFS cluster. Maybe, for example, data nodes could log the amount of time spent on I/O for certain files as a way of reporting whether or not defragmentation needed to be run on a particular node in a cluster.

--
Jay Vyas
http://jayunit100.blogspot.com
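Nothing like that exists in the DataNode as far as I know, but the per-file accounting Jay describes could look roughly like the sketch below; the class name and the 30 MB/s threshold are invented for illustration:

    import java.io.FileInputStream;
    import java.io.IOException;

    // Hypothetical probe: time a full sequential read of a block file
    // and report effective throughput. A real implementation would hook
    // into the DataNode's read path rather than re-reading files.
    public class ReadThroughputProbe {
      public static double megabytesPerSecond(String path) throws IOException {
        byte[] buf = new byte[1 << 20];      // 1 MB read buffer
        long bytes = 0;
        long start = System.nanoTime();
        FileInputStream in = new FileInputStream(path);
        try {
          int n;
          while ((n = in.read(buf)) > 0) {
            bytes += n;
          }
        } finally {
          in.close();
        }
        double seconds = (System.nanoTime() - start) / 1e9;
        return (bytes / (1024.0 * 1024.0)) / seconds;
      }

      public static void main(String[] args) throws IOException {
        double mbps = megabytesPerSecond(args[0]);
        // Invented threshold: flag files that read well below raw disk
        // speed as candidates for defragmentation.
        if (mbps < 30.0) {
          System.out.println(args[0] + " reads at " + mbps
              + " MB/s; may be fragmented");
        }
      }
    }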
