hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-4504) DFSOutputStream#close doesn't always release resources (such as leases)
Date Tue, 13 Aug 2013 02:43:49 GMT

    [ https://issues.apache.org/jira/browse/HDFS-4504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13737737#comment-13737737

Colin Patrick McCabe commented on HDFS-4504:

bq. TreeMaps are memory inefficient, so I'd prefer to use something else unless we need ordering.
I believe ConcurrentHashMap has safe concurrent iteration behavior.

A balanced tree should take up effectively no memory when it's empty, which this tree will
be almost all of the time.  In contrast, a hash table needs to pre-reserve a chunk of memory
to avoid collisions.  Minimum capacity for java.util.HashMap seems to be 16, which means you'll
have to spend 16 * sizeof(pointer to jobject) = 128 bytes just to do nothing (which is what
you'll be doing most of the time).  Also, we do need ordering here.  Finally, iterating over
hash maps is slow because you have to check all the null entries.

A hash map would be a really terrible thing to use here, because if it did ever grow, that
would be just a transitory event.  And then it would never shrink again, but just continue
using that memory forever.  Check HashMap.java if you are curious.

bq. Any reason for the StreamInfo class over a Long?

StreamInfo is mutable; Long is not.  Although I could remove and re-insert the element into
the map, it seems unnecessary.

bq. Unrelated to this patch, but DFSClient#endFileLease is poorly named, since it doesn't
actually end the lease. Maybe rename it to stopLeaseRenewal?

OK, I'll change it to stopLeaseRenewal.

bq. TestHdfsClose: should use GenericTestUtils#assertExceptionContains when catching an expected

It's clearly going to throw some IOException, because the DataNode/NameNode/etc are down,
but I don't think the message inside the IOException is important.

bq. I noticed that endFileLease is now called before completeFile, could the lease expire
between these two calls? I guess it'll still get cleaned up okay by the ZSM, but this is extra

I suppose I might as well wait until after completeFile to end the lease renewal.  As you've
already inferred, it's extremely unlikely to make a difference, since if there are absolutely
any other files open by the client, the lease will still be renewed.
> DFSOutputStream#close doesn't always release resources (such as leases)
> -----------------------------------------------------------------------
>                 Key: HDFS-4504
>                 URL: https://issues.apache.org/jira/browse/HDFS-4504
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-4504.001.patch, HDFS-4504.002.patch, HDFS-4504.007.patch, HDFS-4504.008.patch,
HDFS-4504.009.patch, HDFS-4504.010.patch
> {{DFSOutputStream#close}} can throw an {{IOException}} in some cases.  One example is
if there is a pipeline error and then pipeline recovery fails.  Unfortunately, in this case,
some of the resources used by the {{DFSOutputStream}} are leaked.  One particularly important
resource is file leases.
> So it's possible for a long-lived HDFS client, such as Flume, to write many blocks to
a file, but then fail to close it.  Unfortunately, the {{LeaseRenewerThread}} inside the client
will continue to renew the lease for the "undead" file.  Future attempts to close the file
will just rethrow the previous exception, and no progress can be made by the client.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

View raw message