hadoop-hdfs-issues mailing list archives

From "Eli Collins (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3577) WebHdfsFileSystem can not read files larger than 24KB
Date Fri, 20 Jul 2012 07:02:35 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418974#comment-13418974 ]

Eli Collins commented on HDFS-3577:
-----------------------------------

Hey Nicholas,

Did you test this with distcp? Running distcp from a recent trunk build that includes this change
still fails with *Content-Length header is missing*. hadoop fs -get over webhdfs works with the
same file.

{noformat}
12/07/19 23:56:43 INFO mapreduce.Job: Task Id : attempt_1342766959778_0002_m_000000_0, Status : FAILED
Error: java.io.IOException: File copy failed: webhdfs://eli-thinkpad:50070/user/eli/data1/big.iso --> hdfs://localhost:8020/user/eli/data4/data1/big.iso
	at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:262)
	at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:229)
	at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:45)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:726)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:154)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:149)
Caused by: java.io.IOException: Couldn't run retriable-command: Copying webhdfs://eli-thinkpad:50070/user/eli/data1/big.iso to hdfs://localhost:8020/user/eli/data4/data1/big.iso
	at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:101)
	at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:258)
	... 10 more
Caused by: org.apache.hadoop.tools.mapred.RetriableFileCopyCommand$CopyReadException: java.io.IOException: Content-Length header is missing
	at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.readBytes(RetriableFileCopyCommand.java:201)
	at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.copyBytes(RetriableFileCopyCommand.java:167)
	at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.copyToTmpFile(RetriableFileCopyCommand.java:112)
	at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doCopy(RetriableFileCopyCommand.java:90)
	at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doExecute(RetriableFileCopyCommand.java:71)
	at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:87)
	... 11 more
Caused by: java.io.IOException: Content-Length header is missing
	at org.apache.hadoop.hdfs.ByteRangeInputStream.openInputStream(ByteRangeInputStream.java:125)
	at org.apache.hadoop.hdfs.ByteRangeInputStream.getInputStream(ByteRangeInputStream.java:103)
	at org.apache.hadoop.hdfs.ByteRangeInputStream.read(ByteRangeInputStream.java:158)
	at java.io.DataInputStream.read(DataInputStream.java:132)
	at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
	at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
	at java.io.FilterInputStream.read(FilterInputStream.java:90)
	at org.apache.hadoop.tools.util.ThrottledInputStream.read(ThrottledInputStream.java:70)
	at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.readBytes(RetriableFileCopyCommand.java:198)
	... 16 more
{noformat}
                
> WebHdfsFileSystem can not read files larger than 24KB
> -----------------------------------------------------
>
>                 Key: HDFS-3577
>                 URL: https://issues.apache.org/jira/browse/HDFS-3577
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs client
>    Affects Versions: 0.23.3, 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Tsz Wo (Nicholas), SZE
>            Priority: Blocker
>             Fix For: 0.23.3, 2.1.0-alpha
>
>         Attachments: h3577_20120705.patch, h3577_20120708.patch, h3577_20120714.patch, h3577_20120716.patch, h3577_20120717.patch
>
>
> When the file being read is large enough that the HTTP server running webhdfs/httpfs uses chunked
transfer encoding (more than 24KB in the case of webhdfs), the WebHdfsFileSystem client
fails with an IOException with the message *Content-Length header is missing*.
> It looks like WebHdfsFileSystem delegates opening the input stream to the *ByteRangeInputStream.URLOpener*
class, which checks for the *Content-Length* header. With chunked transfer encoding
the *Content-Length* header is not present, so the *URLOpener.openInputStream()* method throws
an exception.
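
The failure mode described above can be illustrated with a small standalone sketch. This is not the code from the attached patches or from ByteRangeInputStream. The class name ResponseLength, the method expectedLength, and the UNKNOWN_LENGTH sentinel are hypothetical names chosen for illustration. The sketch assumes the response headers have already been collected into a simple name-to-value map with no null keys, and it shows one plausible way a client could treat a chunked response as "length unknown, read to end of stream" instead of failing when Content-Length is absent.

```java
import java.io.IOException;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, for illustration only; not part of Hadoop.
public class ResponseLength {
    /** Sentinel: the length is not known up front, so read until end of stream. */
    static final long UNKNOWN_LENGTH = -1L;

    /**
     * Decides how many bytes to expect from an HTTP response.
     * Assumes headers is a plain name -> value map without null keys.
     */
    static long expectedLength(Map<String, String> headers) throws IOException {
        // HTTP header names are case-insensitive, so normalize the lookup.
        Map<String, String> h = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        h.putAll(headers);

        String contentLength = h.get("Content-Length");
        if (contentLength != null) {
            try {
                long n = Long.parseLong(contentLength.trim());
                if (n < 0) {
                    throw new IOException("Invalid Content-Length: " + contentLength);
                }
                return n;
            } catch (NumberFormatException e) {
                throw new IOException("Invalid Content-Length: " + contentLength, e);
            }
        }

        // A chunked response legitimately carries no Content-Length header.
        if (isChunked(h.get("Transfer-Encoding"))) {
            return UNKNOWN_LENGTH;
        }

        // Neither header is usable: same message as in the reported failure.
        throw new IOException("Content-Length header is missing");
    }

    /** True when "chunked" is the final transfer coding in the header value. */
    static boolean isChunked(String transferEncoding) {
        if (transferEncoding == null) {
            return false;
        }
        String[] codings = transferEncoding.split(",");
        if (codings.length == 0) {
            return false;
        }
        String last = codings[codings.length - 1].trim().toLowerCase(Locale.ROOT);
        return last.equals("chunked");
    }
}
```

A caller that receives UNKNOWN_LENGTH would read until the stream reports end of file rather than counting bytes against a known total. Whether the actual fix takes this approach is not stated in this message.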

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
