From: Martin Serrano
To: user@hadoop.apache.org
Subject: diagnosing the difference between dfs 'du' and 'df'
Date: Tue, 22 Dec 2015 02:21:25 +0000

Hi,

I have an application that is writing data rapidly directly to HDFS
(creates and appends) as well as to HBase (10-15 tables). The disk free
for the filesystem will report that a large percentage of the system is
in use:

$ hdfs dfs -df -h /
Filesystem     Size     Used  Available  Use%
hdfs://ha   882.6 G  472.6 G    409.9 G   54%

Yet when I try to figure out where the disk space is being used,
dfs -du reports:

$ hdfs dfs -du -h /
0        /app-logs
7.6 G    /apps
382.2 M  /hdp
0        /mapred
0        /mr-history
8.5 K    /tmp
3.8 G    /user

A dfsadmin -report during the same time frame is below. I'm trying to
figure out where all of this space is going.
When my application is killed or quiescent, the df and dfsadmin reports
fall in line with what I would expect. I'm running HDP 2.3 with a default
configuration as set up by Ambari. I'm looking for hints or suggestions
on how I can investigate this issue. It seems crazy that ingesting 12 GB
or so of data can temporarily consume (reserve?) ~300 GB of HDFS.

Thanks,
Martin

Configured Capacity: 947644268544 (882.56 GB)
Present Capacity: 947064596261 (882.02 GB)
DFS Remaining: 490046627240 (456.39 GB)
DFS Used: 457017969021 (425.63 GB)
DFS Used%: 48.26%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0

-------------------------------------------------
Live datanodes (3):

Name: *.*.*.*:50010 (**********.com)
Hostname: **********.com
Decommission Status : Normal
Configured Capacity: 315881422848 (294.19 GB)
DFS Used: 218955099179 (203.92 GB)
Non DFS Used: 168255175 (160.46 MB)
DFS Remaining: 96758068494 (90.11 GB)
DFS Used%: 69.32%
DFS Remaining%: 30.63%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 15
Last contact: Mon Dec 21 17:17:38 EST 2015


Name: *.*.*.*:50010 (**********.com)
Hostname: **********.com
Decommission Status : Normal
Configured Capacity: 315881422848 (294.19 GB)
DFS Used: 218873337575 (203.84 GB)
Non DFS Used: 151608508 (144.59 MB)
DFS Remaining: 96856476765 (90.20 GB)
DFS Used%: 69.29%
DFS Remaining%: 30.66%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 16
Last contact: Mon Dec 21 17:17:38 EST 2015


Name: *.*.*.*:50010 (*************.com)
Hostname: ***********.com
Decommission Status : Normal
Configured Capacity: 315881422848 (294.19 GB)
DFS Used: 19189532267 (17.87 GB)
Non DFS Used: 259808600 (247.77 MB)
DFS Remaining: 296432081981 (276.07 GB)
DFS Used%: 6.07%
DFS Remaining%: 93.84%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 16
Last contact: Mon Dec 21 17:17:39 EST 2015
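
As a rough cross-check of the figures in this message, the sketch below
(Python, not part of the original post) sums the per-datanode "DFS Used"
values from the report and compares them with what the -du listing would
account for. It rests on stated assumptions rather than anything the
message confirms: a replication factor of 3 (the HDFS default; the post
only says the configuration is the Ambari default), -du sizes that are
pre-replication, and binary units (1 G = 1024^3 bytes) for the -h output.
The names du_sizes, ASSUMED_REPLICATION and summarize are illustrative.

```python
# Figures copied from the dfsadmin -report and "hdfs dfs -du -h /" output
# quoted in the message. Everything else here is an illustrative assumption.
KIB = 1024
MIB = 1024 ** 2
GIB = 1024 ** 3

# "DFS Used" per live datanode, in bytes.
datanode_dfs_used = [218955099179, 218873337575, 19189532267]

# Cluster-wide "DFS Used" from the report summary, in bytes.
cluster_dfs_used = 457017969021

# Top-level directory sizes from -du -h, converted to bytes.
# Assumption: these are logical sizes, before replication.
du_sizes = {
    "/app-logs": 0,
    "/apps": 7.6 * GIB,
    "/hdp": 382.2 * MIB,
    "/mapred": 0,
    "/mr-history": 0,
    "/tmp": 8.5 * KIB,
    "/user": 3.8 * GIB,
}

# Assumption: HDFS default replication; not stated in the message.
ASSUMED_REPLICATION = 3


def summarize():
    """Compare raw datanode usage with what the -du listing would explain."""
    raw_used = sum(datanode_dfs_used)
    logical = sum(du_sizes.values())
    expected_raw = logical * ASSUMED_REPLICATION
    return {
        "raw_used_gib": raw_used / GIB,
        "logical_gib": logical / GIB,
        "expected_raw_gib": expected_raw / GIB,
        "unexplained_gib": (raw_used - expected_raw) / GIB,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.2f}")
```

The three per-node values add up exactly to the cluster-wide DFS Used
figure, so the report is internally consistent. Under the assumptions
above, the -du listing would explain only about 35 GB of raw usage, which
leaves roughly 390 GB of the reported DFS Used unaccounted for by the
visible namespace.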