hadoop-hdfs-issues mailing list archives

From "Kihwal Lee (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3318) Hftp hangs on transfers >2GB
Date Tue, 24 Apr 2012 23:08:07 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261116#comment-13261116
] 

Kihwal Lee commented on HDFS-3318:
----------------------------------

Before the client-side timeout was introduced in hftp, the server side would time out after
200 seconds, which is the jetty keepalive timeout. Now the client-side timeout, which is shorter
than 200 seconds, fires first, and the hftp client thinks the transfer has failed because it
does not detect the end of the transfer based on the content-length header. This doesn't seem
to happen when the file size is < 2GB.  HttpURLConnection.getContentLength() returns an int
(max: 2^31-1), and it might be internally keeping track of progress only as long as
content-length is < 2GB.
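
To illustrate the idea of detecting end of transfer on the client side, here is a minimal
sketch, not the contents of HDFS-3318.patch. It assumes the Content-Length header value is
available as a String, parses it as a long so that it cannot overflow an int, and wraps the
response stream so that reads report EOF once that many bytes have been consumed. The class
name BoundedLengthStream and the helper parseContentLength are hypothetical names chosen for
this example.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/**
 * Hypothetical sketch: report EOF after contentLength bytes instead of
 * blocking until the server closes the connection.
 * skip(), mark() and reset() are not adjusted here, for brevity.
 */
class BoundedLengthStream extends FilterInputStream {
    // Bytes the caller may still read; a long, so lengths > 2GB are safe.
    private long remaining;

    public BoundedLengthStream(InputStream in, long contentLength) {
        super(in);
        // A negative length means "unknown": fall back to unbounded reads.
        this.remaining = contentLength < 0 ? Long.MAX_VALUE : contentLength;
    }

    /** Parse a Content-Length header value as a long; -1 if absent or malformed. */
    public static long parseContentLength(String headerValue) {
        if (headerValue == null) {
            return -1L;
        }
        try {
            return Long.parseLong(headerValue.trim());
        } catch (NumberFormatException e) {
            return -1L;
        }
    }

    @Override
    public int read() throws IOException {
        if (remaining <= 0) {
            return -1; // all announced bytes were consumed
        }
        int b = super.read();
        if (b >= 0) {
            remaining--;
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (remaining <= 0) {
            return -1; // all announced bytes were consumed
        }
        // Never ask the underlying stream for more than what is left.
        int n = super.read(buf, off, (int) Math.min(len, remaining));
        if (n > 0) {
            remaining -= n;
        }
        return n;
    }
}
```

A caller would pass conn.getHeaderField("Content-Length") to parseContentLength and wrap
conn.getInputStream() with the result, so the client stops reading as soon as the announced
number of bytes has arrived.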

As a side effect of the fix, transfers of files bigger than 2GB will finish 200 seconds sooner
than they did before the hftp client timeout was introduced, since the client will no longer
wait for the server side to close the connection.
                
> Hftp hangs on transfers >2GB
> ----------------------------
>
>                 Key: HDFS-3318
>                 URL: https://issues.apache.org/jira/browse/HDFS-3318
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs client
>    Affects Versions: 0.24.0, 0.23.3, 2.0.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HDFS-3318.patch
>
>
> Hftp transfers >2GB hang after the transfer is complete.  The problem appears to be
caused by java internally using an int for the content length.  When it overflows 2GB, it
won't check the bounds of the reads on the input stream.  The client continues reading after
all data is received, and the client blocks until the server times out the connection -- _many_
minutes later.  In conjunction with hftp timeouts, all transfers >2G fail with a read timeout.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
