Date: Wed, 23 Sep 2009 20:40:08 -0700
Subject: Re: HDFS single node cluster vs. NTFS performance comparison
From: Anthony Urso
To: hdfs-user@hadoop.apache.org

The Annals of Improbable Research may be interested. I believe they
recently published a study comparing apples and oranges.

Cheers,
Anthony

On Wed, Sep 23, 2009 at 8:11 AM, HarishKashyap TS wrote:
> Hi All,
>
> I have completed a performance testing activity of HDFS single node vs.
> NTFS file systems. Modified versions of the SLG tools provided by Hadoop
> have been utilized for this activity. Under similar environment
> conditions, the performance of the two file systems has been compared
> across various file operations.
>
> From our tests, statistics related to the amount of overhead introduced
> by HDFS can be obtained.
>
> For example, if the number of files created is considered as a metric,
> then the local file system (NTFS) performs 30% better than HDFS.
>
> We are planning to publish an article on this. Suggestions about
> technical forums where publication of this article would be appropriate
> will be of great help.
>
> Aaron,
>
> Thanks a lot for your inputs and time.
>
> Regards,
>
> Harish Kashyap
>
> From: Aaron Kimball [mailto:aaron@cloudera.com]
> Sent: Wednesday, September 23, 2009 1:49 AM
> To: hdfs-user@hadoop.apache.org
> Subject: Re: HDFS single node cluster vs. NTFS performance comparison
>
> To my knowledge, nobody's benchmarked this in a rigorous fashion. It's
> virtually certain, though, that on the same machine, NTFS would perform
> faster. HDFS does not directly write to the disk driver; it uses the
> local filesystem of the node on which it's installed. So any HDFS writes
> would themselves be channeled through NTFS and then down to the disk.
> The read path, of course, would go through NTFS first and then via HDFS
> out to the client.
>
> So, HDFS can only add overhead. How much overhead is probably not a
> published number.
>
> - Aaron
>
> On Tue, Sep 22, 2009 at 7:25 AM, HarishKashyap TS wrote:
>
> Hi All,
>
> Has performance testing and comparison of HDFS single node cluster vs.
> NTFS file systems been performed? Are any sample results of an HDFS
> single node vs. NTFS performance comparison available?
>
> Your input/feedback regarding this would be very helpful.
>
> Regards,
>
> Harish Kashyap