Date: Mon, 18 Apr 2016 18:01:25 +0000 (UTC)
From: "Ravi Prakash (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Commented] (HDFS-9530) huge Non-DFS Used in hadoop 2.6.2 & 2.7.1

[ https://issues.apache.org/jira/browse/HDFS-9530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15246179#comment-15246179 ]

Ravi Prakash commented on HDFS-9530:
------------------------------------

bq. 1. Reservation happens only when the block is being received using BlockReceiver. No other places reservation happens, so no need to release as well.

Thanks for reminding me, Brahma!
Do you think we should change {{reservedForReplicas}} when a datanode is started up and an older RBW replica is recovered? Specifically, [BlockPoolSlice.getVolumeMap|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/BlockPoolSlice.java#L361] calling {{addToReplicasMap(volumeMap, rbwDir, lazyWriteReplicaMap, false);}}. Also, it seems to me that since we aren't calling {{reserveSpaceForReplica}} in BlockReceiver but instead at a lower level, we will have to worry about calling {{releaseReservedSpace}} at that lower level.

{quote}2. BlockReceiver constructor have a try-catch block where it will release all the bytes reserved, if there is any exceptions after reserving.
3. BlockReceiver#receiveBlock() have the try-catch block where it will release all the bytes reserved if there is any exceptions during the receiving process.{quote}

Could you please point me to the code where you see this happening? I mean specific instances of {{FsVolumeImpl.releaseReservedSpace}} being called, with the stack trace.

bq. Only place left is in DataXceiver#writeBlock(), exception can happen after creation of BlockReceiver and before calling BlockReceiver#receiveBlock(), if failed to connect to Mirror nodes.

Do you mean to imply that the places I found in [this comment|https://issues.apache.org/jira/browse/HDFS-9530?focusedCommentId=15231164&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15231164] need not call {{reserveSpaceForReplica}} / {{releaseReservedSpace}}?
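For context, the reservation lifecycle being debated can be sketched roughly as below. This is a minimal illustration, not the actual FsVolumeImpl/BlockReceiver code: the {{VolumeReservation}} class and the shape of {{receiveBlock}} here are hypothetical, though the method names {{reserveSpaceForReplica}} / {{releaseReservedSpace}} and the counter {{reservedForReplicas}} match the ones under discussion. The point is that the release must sit on every exception path, or the reservation leaks and inflates Non-DFS Used:

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the per-volume accounting in FsVolumeImpl.
class VolumeReservation {
    private final AtomicLong reservedForReplicas = new AtomicLong(0);

    // Reserve space when a BlockReceiver starts writing an RBW replica.
    void reserveSpaceForReplica(long bytes) {
        reservedForReplicas.addAndGet(bytes);
    }

    // Must run on normal completion AND on every failure path.
    void releaseReservedSpace(long bytes) {
        reservedForReplicas.addAndGet(-bytes);
    }

    long getReserved() {
        return reservedForReplicas.get();
    }

    // The try/finally shape being asked about: even if the receive fails
    // midway (e.g. the mirror connection drops), the reservation is released.
    void receiveBlock(long blockSize, Runnable doReceive) {
        reserveSpaceForReplica(blockSize);
        try {
            doReceive.run();
        } finally {
            releaseReservedSpace(blockSize);
        }
    }
}
```

If, as in the trunk code paths cited earlier, the reserve call happens at a lower level than BlockReceiver, the matching {{finally}} has to live at that same level, which is exactly the concern raised above.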
> huge Non-DFS Used in hadoop 2.6.2 & 2.7.1
> -----------------------------------------
>
>                 Key: HDFS-9530
>                 URL: https://issues.apache.org/jira/browse/HDFS-9530
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.7.1
>            Reporter: Fei Hui
>         Attachments: HDFS-9530-01.patch
>
>
> I think there are bugs in HDFS
> ===============================================================================
> here is config
>
> dfs.datanode.data.dir
> file:///mnt/disk4,file:///mnt/disk1,file:///mnt/disk3,file:///mnt/disk2
>
> here is dfsadmin report
> [hadoop@worker-1 ~]$ hadoop dfsadmin -report
> DEPRECATED: Use of this script to execute hdfs command is deprecated.
> Instead use the hdfs command for it.
> Configured Capacity: 240769253376 (224.23 GB)
> Present Capacity: 238604832768 (222.22 GB)
> DFS Remaining: 215772954624 (200.95 GB)
> DFS Used: 22831878144 (21.26 GB)
> DFS Used%: 9.57%
> Under replicated blocks: 4
> Blocks with corrupt replicas: 0
> Missing blocks: 0
> -------------------------------------------------
> Live datanodes (3):
>
> Name: 10.117.60.59:50010 (worker-2)
> Hostname: worker-2
> Decommission Status : Normal
> Configured Capacity: 80256417792 (74.74 GB)
> DFS Used: 7190958080 (6.70 GB)
> Non DFS Used: 721473536 (688.05 MB)
> DFS Remaining: 72343986176 (67.38 GB)
> DFS Used%: 8.96%
> DFS Remaining%: 90.14%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 1
> Last contact: Wed Dec 09 15:55:02 CST 2015
>
> Name: 10.168.156.0:50010 (worker-3)
> Hostname: worker-3
> Decommission Status : Normal
> Configured Capacity: 80256417792 (74.74 GB)
> DFS Used: 7219073024 (6.72 GB)
> Non DFS Used: 721473536 (688.05 MB)
> DFS Remaining: 72315871232 (67.35 GB)
> DFS Used%: 9.00%
> DFS Remaining%: 90.11%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 1
> Last contact: Wed Dec 09 15:55:03 CST 2015
>
> Name: 10.117.15.38:50010 (worker-1)
> Hostname: worker-1
> Decommission Status : Normal
> Configured Capacity: 80256417792 (74.74 GB)
> DFS Used: 8421847040 (7.84 GB)
> Non DFS Used: 721473536 (688.05 MB)
> DFS Remaining: 71113097216 (66.23 GB)
> DFS Used%: 10.49%
> DFS Remaining%: 88.61%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 1
> Last contact: Wed Dec 09 15:55:03 CST 2015
> ================================================================================
> when running a hive job, dfsadmin reports as follows
> [hadoop@worker-1 ~]$ hadoop dfsadmin -report
> DEPRECATED: Use of this script to execute hdfs command is deprecated.
> Instead use the hdfs command for it.
> Configured Capacity: 240769253376 (224.23 GB)
> Present Capacity: 108266011136 (100.83 GB)
> DFS Remaining: 80078416384 (74.58 GB)
> DFS Used: 28187594752 (26.25 GB)
> DFS Used%: 26.04%
> Under replicated blocks: 7
> Blocks with corrupt replicas: 0
> Missing blocks: 0
> -------------------------------------------------
> Live datanodes (3):
>
> Name: 10.117.60.59:50010 (worker-2)
> Hostname: worker-2
> Decommission Status : Normal
> Configured Capacity: 80256417792 (74.74 GB)
> DFS Used: 9015627776 (8.40 GB)
> Non DFS Used: 44303742464 (41.26 GB)
> DFS Remaining: 26937047552 (25.09 GB)
> DFS Used%: 11.23%
> DFS Remaining%: 33.56%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 693
> Last contact: Wed Dec 09 15:37:35 CST 2015
>
> Name: 10.168.156.0:50010 (worker-3)
> Hostname: worker-3
> Decommission Status : Normal
> Configured Capacity: 80256417792 (74.74 GB)
> DFS Used: 9163116544 (8.53 GB)
> Non DFS Used: 47895897600 (44.61 GB)
> DFS Remaining: 23197403648 (21.60 GB)
> DFS Used%: 11.42%
> DFS Remaining%: 28.90%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 750
> Last contact: Wed Dec 09 15:37:36 CST 2015
>
> Name: 10.117.15.38:50010 (worker-1)
> Hostname: worker-1
> Decommission Status : Normal
> Configured Capacity: 80256417792 (74.74 GB)
> DFS Used: 10008850432 (9.32 GB)
> Non DFS Used: 40303602176 (37.54 GB)
> DFS Remaining: 29943965184 (27.89 GB)
> DFS Used%: 12.47%
> DFS Remaining%: 37.31%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 632
> Last contact: Wed Dec 09 15:37:36 CST 2015
> =========================================================================
> but the df output on worker-1 is as follows
> [hadoop@worker-1 ~]$ df
> Filesystem     1K-blocks    Used Available Use% Mounted on
> /dev/xvda1      20641404 4229676  15363204  22% /
> tmpfs            8165456       0   8165456   0% /dev/shm
> /dev/xvdc       20642428 2596920  16996932  14% /mnt/disk3
> /dev/xvdb       20642428 2692228  16901624  14% /mnt/disk4
> /dev/xvdd       20642428 2445852  17148000  13% /mnt/disk2
> /dev/xvde       20642428 2909764  16684088  15% /mnt/disk1
> df output conflicts with dfsadmin report
> any suggestions?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
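One detail worth keeping in mind when reading the reports above: dfsadmin does not measure "Non DFS Used" directly. In the 2.6/2.7 line it is derived as configured capacity minus DFS used minus DFS remaining, so any space a DataNode holds back from its "remaining" figure, such as {{reservedForReplicas}} held by the ~700 open Xceivers during the hive job, shows up as "Non DFS Used" even while df sees the disks as mostly empty. A quick check of that arithmetic against worker-2's numbers from the during-job report (the class name here is just for illustration):

```java
// Sketch of how dfsadmin derives "Non DFS Used" in Hadoop 2.6/2.7:
// whatever capacity is neither accounted as DFS blocks nor reported
// as remaining is assumed to be non-DFS.
class NonDfsUsed {
    static long nonDfsUsed(long capacity, long dfsUsed, long remaining) {
        return capacity - dfsUsed - remaining;
    }

    public static void main(String[] args) {
        // worker-2 during the hive job, from the report above
        long capacity  = 80256417792L;  // Configured Capacity (74.74 GB)
        long dfsUsed   =  9015627776L;  // DFS Used (8.40 GB)
        long remaining = 26937047552L;  // DFS Remaining (25.09 GB)

        // 80256417792 - 9015627776 - 26937047552 = 44303742464,
        // exactly the reported "Non DFS Used: 44303742464 (41.26 GB)",
        // even though df shows only a few GB actually on disk.
        System.out.println(nonDfsUsed(capacity, dfsUsed, remaining));
    }
}
```

This is why leaked reservations (a reserve without a matching release) surface to the operator as "huge Non-DFS Used" rather than as an explicit reserved-space counter.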