From: "Vishwakarma, Chhaya"
To: "user@hadoop.apache.org"
Subject: Hdfs put VS webhdfs
Date: Mon, 27 Jul 2015 12:15:40 +0000

Hi,

I'm loading a 28 GB file into Hadoop HDFS using WebHDFS, and it takes ~25 minutes to load. I tried loading the same file using hdfs put, and it took ~6 minutes. Why is there so much difference in performance?

Which is recommended? Can somebody explain, or direct me to a good link? It would be really helpful.

Below is the command I'm using:

curl -i --negotiate -u: -X PUT "http://$hostname:$port/webhdfs/v1/$destination_file_location/$source_filename.temp?op=CREATE&overwrite=true"

 

This redirects to a DataNode address, which I use in the next step to write the data.
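
The two-step flow described above can be sketched roughly as follows. This is a minimal sketch under assumptions, not the exact script in use: the helper names (webhdfs_create_url, extract_location, webhdfs_put) are hypothetical, the second request is assumed to send the local file with curl's -T option to the address returned in the Location header, and error handling is kept to a single check.

```shell
#!/bin/sh
# Sketch of the two-step WebHDFS CREATE flow. Helper names are hypothetical.

# Build the NameNode CREATE URL from the same pieces used in the command above.
# $1 = host, $2 = port, $3 = destination directory, $4 = destination file name
webhdfs_create_url() {
  printf 'http://%s:%s/webhdfs/v1/%s/%s?op=CREATE&overwrite=true\n' "$1" "$2" "$3" "$4"
}

# Read the response headers printed by `curl -i` on stdin and print the
# value of the Location header (the DataNode address to write to).
extract_location() {
  tr -d '\r' | awk 'tolower($1) == "location:" { print $2; exit }'
}

# Step 1: ask the NameNode where to write (no file data is sent).
# Step 2: send the file contents to the returned DataNode address.
# $1 = local file, remaining arguments as for webhdfs_create_url
webhdfs_put() {
  src=$1; shift
  location=$(curl -s -i --negotiate -u: -X PUT "$(webhdfs_create_url "$@")" | extract_location)
  [ -n "$location" ] || { echo "no redirect received" >&2; return 1; }
  curl -s -i --negotiate -u: -X PUT -T "$src" "$location"
}
```

With the variables from the command above, this would be called as: webhdfs_put /path/to/local/file "$hostname" "$port" "$destination_file_location" "$source_filename.temp" (the local path is a placeholder).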

 

Regards,

Chhaya

 
