Date: Tue, 13 Aug 2013 02:43:49 +0000 (UTC)
From: "Colin Patrick McCabe (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Commented] (HDFS-4504) DFSOutputStream#close doesn't always release resources (such as leases)

[ https://issues.apache.org/jira/browse/HDFS-4504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13737737#comment-13737737 ]

Colin Patrick McCabe commented on HDFS-4504:
--------------------------------------------

bq. TreeMaps are memory inefficient, so I'd prefer to use something else unless we need ordering. I believe ConcurrentHashMap has safe concurrent iteration behavior.

A balanced tree should take up effectively no memory when it's empty, which this tree will be almost all of the time.
In contrast, a hash table needs to pre-reserve a chunk of memory to avoid collisions. The minimum capacity of java.util.HashMap seems to be 16, which means you'll spend 16 * sizeof(pointer to jobject) = 128 bytes just to do nothing (which is what you'll be doing most of the time). Also, we do need ordering here. Finally, iterating over hash maps is slow, because you have to check all the null entries. A hash map would be a really terrible thing to use here: if it ever did grow, that would be just a transitory event, and then it would never shrink again, but just continue using that memory forever. Check HashMap.java if you are curious.

bq. Any reason for the StreamInfo class over a Long?

StreamInfo is mutable; Long is not. I could remove and re-insert the element into the map instead, but that seems unnecessary.

bq. Unrelated to this patch, but DFSClient#endFileLease is poorly named, since it doesn't actually end the lease. Maybe rename it to stopLeaseRenewal?

OK, I'll change it to stopLeaseRenewal.

bq. TestHdfsClose: should use GenericTestUtils#assertExceptionContains when catching an expected IOException.

It's clearly going to throw some IOException, because the DataNode/NameNode/etc. are down, but I don't think the message inside the IOException is important.

bq. I noticed that endFileLease is now called before completeFile, could the lease expire between these two calls? I guess it'll still get cleaned up okay by the ZSM, but this is extra work.

I suppose I might as well wait until after completeFile to end the lease renewal. As you've already inferred, it's extremely unlikely to make a difference, since if the client has any other files open at all, the lease will still be renewed.
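The mutability point can be sketched as follows. Note this is an illustrative stand-in, not the actual StreamInfo class or map from the patch; the field name is hypothetical:

```java
import java.util.TreeMap;

// Illustrative sketch (not the patch's actual code): a mutable holder lets a
// value in the map be updated in place, whereas an immutable Long would force
// a remove-and-reinsert.
public class MutableHolderSketch {
    // Hypothetical stand-in for the StreamInfo class discussed above.
    static final class StreamInfo {
        long lastRenewalMs;
        StreamInfo(long t) { lastRenewalMs = t; }
    }

    public static void main(String[] args) {
        // A TreeMap keeps its keys ordered and allocates nodes only as entries
        // are added, so an empty map costs almost nothing -- the common case here.
        TreeMap<Long, StreamInfo> streams = new TreeMap<>();
        streams.put(1L, new StreamInfo(100L));

        // Mutable holder: update the field in place, no map churn.
        streams.get(1L).lastRenewalMs = 200L;

        // With Long values we would instead have to do something like:
        //   Long old = longMap.remove(1L);
        //   longMap.put(1L, old + delta);

        System.out.println(streams.get(1L).lastRenewalMs);  // prints 200
    }
}
```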
> DFSOutputStream#close doesn't always release resources (such as leases)
> -----------------------------------------------------------------------
>
>                 Key: HDFS-4504
>                 URL: https://issues.apache.org/jira/browse/HDFS-4504
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-4504.001.patch, HDFS-4504.002.patch, HDFS-4504.007.patch, HDFS-4504.008.patch, HDFS-4504.009.patch, HDFS-4504.010.patch
>
>
> {{DFSOutputStream#close}} can throw an {{IOException}} in some cases. One example is if there is a pipeline error and then pipeline recovery fails. Unfortunately, in this case, some of the resources used by the {{DFSOutputStream}} are leaked. One particularly important resource is file leases.
> So it's possible for a long-lived HDFS client, such as Flume, to write many blocks to a file, but then fail to close it. Unfortunately, the {{LeaseRenewerThread}} inside the client will continue to renew the lease for the "undead" file. Future attempts to close the file will just rethrow the previous exception, and no progress can be made by the client.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira