From: Namikaze Minato
Date: Tue, 22 Dec 2015 03:24:41 +0100
Subject: Re: diagnosing the difference between dfs 'du' and 'df'
To: Martin Serrano
Cc: "user@hadoop.apache.org"

This may be a wrong lead, but try running your "du" command as the hdfs
user, so that we can be sure we are not missing any read-protected folders.

Regards,
LLoyd

On 22 December 2015 at 03:21, Martin Serrano wrote:
> Hi,
>
> I have an application that is writing data rapidly directly to HDFS
> (creates and appends) as well as to HBase (10-15 tables). The disk free
> for the filesystem will report that a large percentage of the system is
> in use:
>
> $ hdfs dfs -df -h /
> Filesystem  Size     Used     Available  Use%
> hdfs://ha   882.6 G  472.6 G  409.9 G    54%
>
> Yet when I try to figure out where the disk space is being used using
> dfs -du reports:
>
> $ hdfs dfs -du -h /
> 0        /app-logs
> 7.6 G    /apps
> 382.2 M  /hdp
> 0        /mapred
> 0        /mr-history
> 8.5 K    /tmp
> 3.8 G    /user
>
> A dfsadmin -report during the same time frame is below.
> I'm trying to figure out where all of this space is going to. When my
> application is killed or quiescent, the df and dfsadmin reports fall in
> line with what I would expect. I'm running HDP 2.3 with a default
> configuration as set up by Ambari. I'm looking for hints or suggestions
> on how I can investigate this issue. It seems crazy that ingesting 12g
> or so of data can temporarily consume (reserve?) ~300g of HDFS.
>
> Thanks,
> Martin
>
> Configured Capacity: 947644268544 (882.56 GB)
> Present Capacity: 947064596261 (882.02 GB)
> DFS Remaining: 490046627240 (456.39 GB)
> DFS Used: 457017969021 (425.63 GB)
> DFS Used%: 48.26%
> Under replicated blocks: 0
> Blocks with corrupt replicas: 0
> Missing blocks: 0
> Missing blocks (with replication factor 1): 0
>
> -------------------------------------------------
> Live datanodes (3):
>
> Name: *.*.*.*:50010 (**********.com)
> Hostname: **********.com
> Decommission Status : Normal
> Configured Capacity: 315881422848 (294.19 GB)
> DFS Used: 218955099179 (203.92 GB)
> Non DFS Used: 168255175 (160.46 MB)
> DFS Remaining: 96758068494 (90.11 GB)
> DFS Used%: 69.32%
> DFS Remaining%: 30.63%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 15
> Last contact: Mon Dec 21 17:17:38 EST 2015
>
>
> Name: *.*.*.*:50010 (**********.com)
> Hostname: **********.com
> Decommission Status : Normal
> Configured Capacity: 315881422848 (294.19 GB)
> DFS Used: 218873337575 (203.84 GB)
> Non DFS Used: 151608508 (144.59 MB)
> DFS Remaining: 96856476765 (90.20 GB)
> DFS Used%: 69.29%
> DFS Remaining%: 30.66%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 16
> Last contact: Mon Dec 21 17:17:38 EST 2015
>
>
> Name: *.*.*.*:50010 (*************.com)
> Hostname: ***********.com
> Decommission Status : Normal
> Configured Capacity: 315881422848 (294.19 GB)
> DFS Used: 19189532267 (17.87 GB)
> Non DFS Used: 259808600 (247.77 MB)
> DFS Remaining: 296432081981 (276.07 GB)
> DFS Used%: 6.07%
> DFS Remaining%: 93.84%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 16
> Last contact: Mon Dec 21 17:17:39 EST 2015
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@hadoop.apache.org
> For additional commands, e-mail: user-help@hadoop.apache.org
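
To put a number on the gap described in the quoted message, here is a minimal arithmetic sketch using only the figures from the du output and the dfsadmin report above. It assumes that du reports logical file sizes (before replication) while "DFS Used" counts raw bytes across all replicas, and it assumes a replication factor of 3, which is the usual HDFS default but is not stated anywhere in the thread. The variable names are made up for illustration.

```python
# Figures copied from the "hdfs dfs -du -h /" output and the
# "dfsadmin -report" quoted in this thread. The report's "GB" values
# are 1024-based, so the same unit is used here.
KIB = 1024
MIB = 1024 ** 2
GIB = 1024 ** 3

# "DFS Used" in bytes for each of the three live datanodes.
datanode_dfs_used = [218955099179, 218873337575, 19189532267]

# "DFS Used" in bytes from the cluster summary at the top of the report.
cluster_dfs_used = 457017969021

# Non-empty directories from du, converted from the rounded -h values.
du_sizes = {
    "/apps": 7.6 * GIB,
    "/hdp": 382.2 * MIB,
    "/tmp": 8.5 * KIB,
    "/user": 3.8 * GIB,
}

# ASSUMPTION: replication factor 3 (HDFS default); not given in the thread.
ASSUMED_REPLICATION = 3

raw_total = sum(datanode_dfs_used)            # raw bytes on disk
logical_total = sum(du_sizes.values())        # logical bytes visible to du
expected_raw = logical_total * ASSUMED_REPLICATION
unexplained = raw_total - expected_raw        # raw bytes du does not account for

print("per-node sum matches cluster DFS Used:", raw_total == cluster_dfs_used)
print(f"du total (logical):        {logical_total / GIB:8.2f} GB")
print(f"expected raw at x{ASSUMED_REPLICATION}:        {expected_raw / GIB:8.2f} GB")
print(f"reported DFS Used:         {raw_total / GIB:8.2f} GB")
print(f"not accounted for by du:   {unexplained / GIB:8.2f} GB")
```

If the du figures change noticeably when the command is rerun as the hdfs user, the "not accounted for" value above should shrink accordingly; if they do not, unreadable directories are probably not the explanation.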