From: Hao Gong
To: hdfs-user@hadoop.apache.org
Date: Mon, 03 Aug 2009 15:59:55 +0800
Subject: why did I achieve such poor performance of HDFS

Hi all,

I am using HDFS as a distributed storage system for an experiment, and in my tests its read performance is very poor.

I ran two scenarios:

1) Medium-size file test: I PUT 200,000 medium-size files (random sizes from 20KB to 20MB) into HDFS, then had 10 clients GET 5,000 random files simultaneously. The average GET throughput per client was very poor (less than about 14,000 KBps).

2) Large-size file test: I PUT 20,000 large files (random sizes from 250MB to 750MB) into HDFS, then had 10 clients GET 100 random files simultaneously. The average GET throughput per client was again very poor (less than about 12,500 KBps).

I am puzzled by these results. Why does HDFS perform so poorly? The throughput each client achieves is far below the limit of the network bandwidth. Is there any parameter I need to change to get high performance from HDFS? (I used the default parameter values.)

My environment is as follows:

1) 30 ordinary PCs as HDFS slaves (Core 2 E7200, 4 GB RAM, 1.5 TB HDD)
2) 10 ordinary PCs as HDFS clients (Core 2 E7200, 4 GB RAM, 1.5 TB HDD)
3) 1 ordinary PC as HDFS master (Core 2 E7200, 4 GB RAM, 1.5 TB HDD)
4) A 1000 Mbps switch, with links in a star network topology
5) Hadoop version 0.20.0, JRE version 1.6.0_11

If anybody has studied the performance of HDFS, please contact me. Thank you very much.

Best regards,
Hao Gong
Huawei Technologies Co., Ltd

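To put the reported figures next to the nominal link capacity, here is a minimal sketch of the arithmetic. It rests on assumptions that the message does not state: that "KBps" means kilobytes per second with 1 KB = 1000 bytes, that "1000M" means a 1000 Mbit/s link per machine, and that protocol overhead can be ignored. The function and constant names are made up for illustration.

```python
# Assumption: each client has one 1000 Mbit/s link, and 1 KB = 1000 bytes.
LINK_MBIT_PER_S = 1000
LINK_KB_PER_S = LINK_MBIT_PER_S * 1000 // 8  # nominal capacity: 125000 KB/s

# Number of clients reading at the same time, as stated in the message.
NUM_CLIENTS = 10

# Upper bounds on average per-client GET throughput reported in the message.
REPORTED_KB_PER_S = {
    "medium files (20KB~20MB)": 14000,
    "large files (250MB~750MB)": 12500,
}


def link_utilization(client_kb_per_s, link_kb_per_s=LINK_KB_PER_S):
    """Fraction of one client's nominal link capacity used by its GET rate."""
    return client_kb_per_s / link_kb_per_s


def aggregate_kb_per_s(client_kb_per_s, num_clients=NUM_CLIENTS):
    """Combined rate if every client reaches the same average throughput."""
    return client_kb_per_s * num_clients


if __name__ == "__main__":
    for name, rate in REPORTED_KB_PER_S.items():
        print(
            f"{name}: {link_utilization(rate):.1%} of one link, "
            f"{aggregate_kb_per_s(rate)} KB/s across {NUM_CLIENTS} clients"
        )
```

Under these assumptions each client uses roughly 10 to 11 percent of its own link, while the ten clients together move 140,000 and 125,000 KB/s. If "KBps" was measured in 1024-byte units or in kilobits, the ratios change accordingly.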