Message-ID: <9366534.254731290496636311.JavaMail.jira@thor>
Date: Tue, 23 Nov 2010 02:17:16 -0500 (EST)
From: "Eli Collins (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] Updated: (HDFS-1377) Quota bug for partial blocks allows quotas to be violated

[
https://issues.apache.org/jira/browse/HDFS-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eli Collins updated HDFS-1377:
------------------------------

    Attachment: hdfs-1377-1.patch

Patch for trunk attached. Release audit warnings are a separate issue. No change in test failures.

{noformat}
     [exec]
     [exec] -1 overall.
     [exec]
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec]
     [exec]     +1 tests included.  The patch appears to include 9 new or modified tests.
     [exec]
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec]
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec]
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.
     [exec]
     [exec]     -1 release audit.  The applied patch generated 103 release audit warnings (more than the trunk's current 98 warnings).
     [exec]
     [exec]     +1 system test framework.  The patch passed system test framework compile.
     [exec]
{noformat}

> Quota bug for partial blocks allows quotas to be violated
> ----------------------------------------------------------
>
>                 Key: HDFS-1377
>                 URL: https://issues.apache.org/jira/browse/HDFS-1377
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.20.1, 0.20.2, 0.21.0, 0.22.0, 0.23.0
>            Reporter: Eli Collins
>            Assignee: Eli Collins
>            Priority: Blocker
>             Fix For: 0.20.3, 0.21.1, 0.22.0, 0.23.0
>
>         Attachments: hdfs-1377-1.patch, hdfs-1377-b20-1.patch, hdfs-1377-b20-2.patch, hdfs-1377-b20-3.patch, HDFS-1377.patch
>
>
> There's a bug in the quota code that causes quotas not to be respected when a file's size is not an exact multiple of the block size.
> Here's an example:
> {code}
> $ hadoop fs -mkdir /test
> $ hadoop dfsadmin -setSpaceQuota 384M /test
> $ ls dir/ | wc -l   # dir contains 101 files
> 101
> $ du -ms dir        # each is 3mb
> 304	dir
> $ hadoop fs -put dir /test
> $ hadoop fs -count -q /test
>         none             inf       402653184      -550502400            2          101          317718528 hdfs://haus01.sf.cloudera.com:10020/test
> $ hadoop fs -stat "%o %r" /test/dir/f30
> 134217728 3    # 128mb block size, 3 replicas
> {code}
> INodeDirectoryWithQuota caches the number of bytes consumed by its children in {{diskspace}}. The quota adjustment code has a bug that causes {{diskspace}} to get updated incorrectly when a file is not an exact multiple of the block size (the value ends up being negative).
> This causes the quota checking code to think that the files in the directory consume less space than they actually do, so verifyQuota does not throw a QuotaExceededException even when the directory is over quota. However, the bug isn't visible to users because {{fs -count -q}} reports the numbers generated by INode#getContentSummary, which adds up the sizes of the blocks rather than using the cached INodeDirectoryWithQuota#diskspace value.
> In FSDirectory#addBlock the disk space consumed is set conservatively to the full block size * the number of replicas:
> {code}
> updateCount(inodes, inodes.length-1, 0,
>     fileNode.getPreferredBlockSize()*fileNode.getReplication(), true);
> {code}
> In FSNamesystem#addStoredBlock we adjust for this conservative estimate by subtracting out the difference between the conservative estimate and the number of bytes actually stored:
> {code}
> //Updated space consumed if required.
> INodeFile file = (storedBlock != null) ? storedBlock.getINode() : null;
> long diff = (file == null) ? 0 :
>     (file.getPreferredBlockSize() - storedBlock.getNumBytes());
> if (diff > 0 && file.isUnderConstruction() &&
>     cursize < storedBlock.getNumBytes()) {
> ...
>     dir.updateSpaceConsumed(path, 0, -diff*file.getReplication());
> {code}
> We do the same in FSDirectory#replaceNode when completing the file, but at file granularity (I believe the intent here is to correct for the cases when there's a failure replicating blocks and recovery). Since oldnode is under construction, INodeFile#diskspaceConsumed will use the preferred block size (vs. Block#getNumBytes, used by newnode), so we will again subtract out the difference between the full block size and the number of bytes actually stored:
> {code}
> long dsOld = oldnode.diskspaceConsumed();
> ...
> //check if disk space needs to be updated.
> long dsNew = 0;
> if (updateDiskspace && (dsNew = newnode.diskspaceConsumed()) != dsOld) {
>   try {
>     updateSpaceConsumed(path, 0, dsNew-dsOld);
>     ...
> {code}
> So in the above example we started with diskspace at 384mb (3 * 128mb) and then subtracted 375mb (to reflect that only 9mb raw was actually used) twice, so for each file the diskspace for the directory is -366mb (384mb minus 2 * 375mb). This is why the quota accounting goes negative and yet we can still write more files.
> So a directory with lots of single-block files ends up having a quota that's way off (for files with multiple blocks, only the final partial block ends up subtracting from the diskspace used).
> I think the fix is for FSDirectory#replaceNode not to have the diskspaceConsumed calculations differ when the old and new INode have the same blocks. I'll work on a patch which also adds a quota test for blocks that are not multiples of the block size and warns in INodeDirectory#computeContentSummary if the computed size does not reflect the cached value.
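
To make the arithmetic in the description concrete, here is a minimal sketch of the accounting for one single-block file. It is not code from the HDFS tree: the class QuotaAccountingSketch and its methods are hypothetical names made up for illustration, and the model only covers the numbers from the example (128mb block size, 3 replicas, 3mb files).

```java
// Hypothetical sketch for illustration only; not code from the HDFS source tree.
final class QuotaAccountingSketch {
    static final long MB = 1024L * 1024L;

    /** Space charged when a block is allocated: full block size times replication. */
    static long conservativeCharge(long blockSize, int replication) {
        return blockSize * replication;
    }

    /** Over-estimate for one partial block: unused bytes times replication. */
    static long overEstimate(long blockSize, long storedBytes, int replication) {
        return (blockSize - storedBytes) * replication;
    }

    /**
     * Cached usage for one single-block file after the over-estimate has been
     * subtracted `corrections` times (1 is the intended behaviour, 2 is the bug).
     */
    static long cachedDiskspace(long blockSize, long storedBytes,
                                int replication, int corrections) {
        return conservativeCharge(blockSize, replication)
                - corrections * overEstimate(blockSize, storedBytes, replication);
    }

    public static void main(String[] args) {
        long blockSize = 128 * MB;
        long fileSize = 3 * MB;
        int replication = 3;

        long once = cachedDiskspace(blockSize, fileSize, replication, 1);
        long twice = cachedDiskspace(blockSize, fileSize, replication, 2);

        System.out.println("corrected once:  " + once / MB + " MB");   // 9 MB
        System.out.println("corrected twice: " + twice / MB + " MB");  // -366 MB
    }
}
```

With a single correction the cached value matches the 9mb of raw space actually used per file. With the correction applied in both places it goes to -366mb per file, which is consistent with the directory being over quota without verifyQuota noticing.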