Message-ID: <4BED086E.7010906@inmobi.com>
Date: Fri, 14 May 2010 13:53:10 +0530
From: Rohan Rai
Reply-To: hdfs-user@hadoop.apache.org
Subject: HDFS Read ThroughPut and DISK Read ThroughPut

Hi,

Is there a relationship between HDFS read throughput and disk read throughput? If so, what is it?

Say each disk gives us 120 MB/s, and we have a cluster of 6 nodes, each node having 6 disks. In an absolutely ideal world, reading from all disks in parallel should give an aggregate throughput of 120 * 6 * 6 = 4320 MB/s. In a non-ideal world we can divide that figure by some factor x.

Why, then, is the overall cluster read throughput so much lower? It generally hovers around 90 MB/s. How is the throughput that the cluster provides accounted for?

For information, the configuration is: 8 GB RAM, 250 GB HDD, 8 maps per node, 128 KB block size.

Regards,
Rohan
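
The arithmetic in the message can be sketched as follows. This is only an illustration using the figures quoted above (120 MB/s per disk, 6 nodes, 6 disks per node, roughly 90 MB/s observed); the function and variable names are hypothetical, and the calculation assumes every disk can be read in parallel at its full rate, which a real cluster will not achieve.

```python
# Figures quoted in the message; all names here are illustrative only.
DISK_MBPS = 120        # read throughput of a single disk, MB/s
NODES = 6              # nodes in the cluster
DISKS_PER_NODE = 6     # disks per node
OBSERVED_MBPS = 90     # approximate cluster read throughput reported, MB/s


def ideal_aggregate_mbps(disk_mbps, nodes, disks_per_node):
    """Upper bound, assuming every disk streams at full rate in parallel."""
    return disk_mbps * nodes * disks_per_node


def implied_factor(ideal_mbps, observed_mbps):
    """The factor x by which the ideal figure exceeds the observed one."""
    return ideal_mbps / observed_mbps


ideal = ideal_aggregate_mbps(DISK_MBPS, NODES, DISKS_PER_NODE)
x = implied_factor(ideal, OBSERVED_MBPS)

print(f"ideal aggregate: {ideal} MB/s")   # ideal aggregate: 4320 MB/s
print(f"implied factor x: {x:.0f}")       # implied factor x: 48
```

With these numbers the observed throughput is about 1/48 of the ideal aggregate, which is the gap the question asks to have explained.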