From: Arpit Gupta
To: user@hadoop.apache.org
Subject: Re: WEBHDFS API GETCONTENTSUMMARY issue
Date: Thu, 10 Jan 2013 15:55:58 -0800

Rodrigo,

GETCONTENTSUMMARY returns the summary of everything under the path you specified, including the contents of its subdirectories. I would suggest looking inside the subdirectories to see what they contain; the numbers should then add up.

--
Arpit Gupta
Hortonworks Inc.
http://hortonworks.com/

On Jan 10, 2013, at 6:59 AM, "Pastrana, Rodrigo (RIS-BCT)" wrote:

> I'm using WEBHDFS to query directory/file information, but the GETCONTENTSUMMARY counts aren't returning expected counts.
>
> For example, when I query content summary for the directory /user/hadoop/tutorial, webhdfs returns the following:
>
> x.y.z.w:50070/webhdfs/v1/user/hadoop/tutorial/?op=GETCONTENTSUMMARY
> {"ContentSummary":{"directoryCount":4,"fileCount":10,"length":3204490622,"quota":-1,"spaceConsumed":3204490622,"spaceQuota":-1}}
>
> But looking at the content of that dir through the web portal, I see 6 files and 3 subdirs:
>
> accounts              file  812.93 MB  1  64 MB  2012-12-12 10:59  rw-r--r--  hadoop  supergroup
> accounts2             file  812.93 MB  1  64 MB  2012-07-31 16:48  rw-r--r--  hadoop  supergroup
> accounts2-parts       dir                        2012-08-17 15:34  rwxr-xr-x  hadoop  supergroup
> persons               file  124.38 MB  1  64 MB  2012-07-11 13:19  rw-r--r--  hadoop  supergroup
> persons-parts         dir                        2012-12-12 10:35  rwxr-xr-x  hadoop  supergroup
> persons2              file  124.38 MB  1  64 MB  2012-07-20 13:53  rw-r--r--  hadoop  supergroup
> persons3              file  124.38 MB  1  64 MB  2012-12-12 10:53  rw-r--r--  hadoop  supergroup
> short-accounts        file   59.88 MB  1  64 MB  2012-12-06 12:26  rw-r--r--  hadoop  supergroup
> short-accounts-parts  dir                        2012-12-06 15:00  rwxr-xr-x  hadoop  supergroup
>
> Can anybody help make sense of the summary numbers?
>
> Thanks, Rodrigo.
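
To make the recursive behaviour concrete, here is a minimal Python sketch that walks a directory with the WebHDFS LISTSTATUS operation and tallies the same three totals that GETCONTENTSUMMARY reports. It is only an illustration under assumptions: the function names, the `/demo` tree, and the `fake_lister` stand-in are hypothetical, and it assumes (as I understand the behaviour) that directoryCount includes the queried directory itself. Read that way, the reported 4 directories would be the directory plus its 3 subdirs, and the 10 files would be the 6 visible ones plus 4 more inside the subdirs.

```python
import json
from urllib.request import urlopen


def list_status(base_url, path):
    """Fetch the entries of one directory via WebHDFS LISTSTATUS.

    base_url is an assumption, e.g. "http://x.y.z.w:50070".
    """
    with urlopen(f"{base_url}/webhdfs/v1{path}?op=LISTSTATUS") as resp:
        return json.load(resp)["FileStatuses"]["FileStatus"]


def summarize(path, lister):
    """Return (directory_count, file_count, length) for everything under path.

    The starting directory counts as one directory (assumed to match
    directoryCount); anything that is not a DIRECTORY is counted as a file.
    """
    dirs, files, length = 1, 0, 0
    for entry in lister(path):
        child = path.rstrip("/") + "/" + entry["pathSuffix"]
        if entry["type"] == "DIRECTORY":
            sub_dirs, sub_files, sub_length = summarize(child, lister)
            dirs += sub_dirs
            files += sub_files
            length += sub_length
        else:
            files += 1
            length += entry["length"]
    return dirs, files, length


# Hypothetical in-memory tree so the sketch runs without a cluster.
FAKE_TREE = {
    "/demo": [
        {"pathSuffix": "a", "type": "FILE", "length": 10},
        {"pathSuffix": "sub", "type": "DIRECTORY", "length": 0},
    ],
    "/demo/sub": [
        {"pathSuffix": "b", "type": "FILE", "length": 5},
        {"pathSuffix": "c", "type": "FILE", "length": 7},
    ],
}


def fake_lister(path):
    """Stand-in for list_status that reads from FAKE_TREE."""
    return FAKE_TREE[path]
```

Against a real cluster, something like `summarize("/user/hadoop/tutorial", lambda p: list_status("http://x.y.z.w:50070", p))` should show which subdirectory contributes the extra files, assuming the NameNode address from the original query.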
