Date: Fri, 4 May 2012 15:56:48 +0000 (UTC)
From: "Kihwal Lee (JIRA)"
To: hdfs-issues@hadoop.apache.org
Reply-To: hdfs-issues@hadoop.apache.org
Message-ID: <669411871.27511.1336147008696.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <974944667.37015.1331165576997.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HDFS-3061) Cached directory size in INodeDirectory can get permantently out of sync with computed size, causing quota issues

[
https://issues.apache.org/jira/browse/HDFS-3061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268475#comment-13268475 ]

Kihwal Lee commented on HDFS-3061:
----------------------------------

Since HDFS-1487 is closed, I will track the work in this jira.

> Cached directory size in INodeDirectory can get permantently out of sync with computed size, causing quota issues
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-3061
>                 URL: https://issues.apache.org/jira/browse/HDFS-3061
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.20.203.0, 0.23.0, 1.0.2
>         Environment: 0.20.203 with HDFS-1377 and HDFS-2053 patches applied
>            Reporter: Alex Holmes
>            Assignee: Kihwal Lee
>            Priority: Blocker
>             Fix For: 1.1.0, 0.23.3, 1.0.3, 2.0.0, 3.0.0
>
>         Attachments: QuotaTestSimple.java
>
>
> It appears that there's a condition under which an HDFS directory with a space quota set can reach a state where the cached size for the directory permanently differs from the computed value. When this happens, the following command:
> {code}
> hadoop fs -count -q /tmp/quota-test
> {code}
> results in the following output in the NameNode logs:
> {code}
> WARN org.apache.hadoop.hdfs.server.namenode.NameNode: Inconsistent diskspace for directory quota-test. Cached: 6000 Computed: 6072
> {code}
> I've observed both transient and persistent instances of this happening. In the transient instances this warning goes away, but in the persistent instances every invocation of the {{fs -count -q}} command yields the above warning.
> I've seen instances where the actual disk usage of a directory is 25% of the cached value in INodeDirectory, which creates problems since the quota code uses this cached value to determine whether block write requests are permitted.
> This isn't easy to reproduce - I am able to (inconsistently) get HDFS into this state with a simple program which:
> # Writes files into HDFS
> # When a DSQuotaExceededException is encountered, removes all files created in step 1
> # Repeats step 1
> I'm going to try to come up with a more repeatable test case to reproduce this issue.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
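
For anyone trying to tell the transient case from the persistent one, it may help to pull the cached and computed values out of the NameNode warning quoted in the description and compare them across repeated {{fs -count -q}} runs. The following is only a minimal sketch: QuotaWarningParser and Drift are hypothetical names that do not exist in Hadoop, and the regular expression is inferred from the single log line in the report, so the exact message format in other releases is an assumption.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Hadoop) for the warning quoted in HDFS-3061. */
final class QuotaWarningParser {

    /** Values extracted from one "Inconsistent diskspace" warning line. */
    static final class Drift {
        final String directory;
        final long cached;
        final long computed;

        Drift(String directory, long cached, long computed) {
            this.directory = directory;
            this.cached = cached;
            this.computed = computed;
        }

        /** Positive when the computed size exceeds the cached size. */
        long difference() {
            return computed - cached;
        }
    }

    // Assumption: the message layout matches the one log line shown in the report.
    private static final Pattern WARNING = Pattern.compile(
            "Inconsistent diskspace for directory (.+?)\\. Cached: (\\d+) Computed: (\\d+)");

    /** Returns the parsed values, or empty if the line is not this warning. */
    static Optional<Drift> parse(String logLine) {
        Matcher m = WARNING.matcher(logLine);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new Drift(
                m.group(1),
                Long.parseLong(m.group(2)),
                Long.parseLong(m.group(3))));
    }

    public static void main(String[] args) {
        String line = "WARN org.apache.hadoop.hdfs.server.namenode.NameNode: "
                + "Inconsistent diskspace for directory quota-test. Cached: 6000 Computed: 6072";
        // Prints the directory name and the computed-minus-cached difference.
        parse(line).ifPresent(d ->
                System.out.println(d.directory + " drift=" + d.difference()));
    }
}
```

If the same nonzero difference shows up for a directory on every invocation, that would match the persistent case described in the report; a difference that disappears on a later run would match the transient case.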