hadoop-hdfs-user mailing list archives

From Robert Rapplean <robert.rappl...@trueffect.com>
Subject RE: hftp can list directories but won't send files
Date Tue, 18 Dec 2012 23:05:50 GMT
Thanks for the reply, Arpit.

Yes, both of those produce a correct response, although the second one's syntax is:

hadoop fs -cat logfiles/day_id=19991231/hour_id=1999123123/000008_0

From: Arpit Gupta [mailto:arpit@hortonworks.com]
Sent: Tuesday, December 18, 2012 3:49 PM

Hi Robert

Does the cat work for you if you don't use hftp? Something like:

hadoop fs -cat hdfs://hdenn00.trueffect.com:8020/user/username/logfiles/day_id=19991231/hour_id=1999123123/000008_0x


hadoop fs -cat /user/username/logfiles/day_id=19991231/hour_id=1999123123/000008_0x

Arpit Gupta
Hortonworks Inc.

On Dec 18, 2012, at 2:43 PM, Robert Rapplean <robert.rapplean@trueffect.com> wrote:

Hey, everyone. I just finished reading all of the unsubscribe messages from Sept-Oct,
and I'm hoping someone has a clue about what my system is doing wrong. I suspect that this
is a configuration issue, but I don't even know where to start looking for it. I'm a developer,
and my sysadmin is tied up until the end of the year.

I'm trying to move files from one cluster to another using distcp with the hftp protocol,
as specified in the distcp instructions.

I can read directories over hftp, but when I attempt to get a file I get a 500 (internal server
error). To eliminate the possibility of network and firewall issues, I'm using hadoop fs -ls
and hadoop fs -cat commands on the source server in order to attempt to figure out this issue.

This produces a directory listing of the files, which is correct:

hadoop fs -ls ourlogs/day_id=19991231/hour_id=1999123123
-rw-r--r--   3 username supergroup        812 2012-12-16 17:21 logfiles/day_id=19991231/hour_id=1999123123/000008_0

This gives me a "file not found" error, which is also correct because the file isn't there:

hadoop fs -cat hftp://hdenn00.trueffect.com:50070/user/username/logfiles/day_id=19991231/hour_id=1999123123/000008_0x
cat: `hftp://hdenn00.trueffect.com:50070/user/prodman/ods_fail/day_id=19991231/hour_id=1999123123/000008_0x': No such file or directory

This line gives me a 500 internal server error. The file is confirmed to be on the server.

hadoop fs -cat hftp://hdenn00.trueffect.com:50070/user/username/logfiles/day_id=19991231/hour_id=1999123123/000008_0
cat: HTTP_OK expected, received 500
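
One possible reason a listing can succeed while a read fails is that the two operations may not hit the same HTTP endpoint. The sketch below assumes (this is not stated anywhere in this thread) that the hftp client of this Hadoop generation lists directories through a NameNode servlet at /listPaths and reads files through /data, with an optional ugi query parameter. The function name hftp_urls and both servlet paths are assumptions for illustration; check them against your own Hadoop version.

```python
from urllib.parse import quote


def hftp_urls(host, port, path, ugi=None):
    """Return the (listing_url, data_url) pair an hftp:// path is ASSUMED to map to.

    The /listPaths and /data prefixes and the ugi parameter are assumptions
    about the NameNode web server, not facts taken from this thread.
    """
    # Keep "/" and "=" literal so partition-style names such as day_id=... survive.
    encoded = quote(path, safe="/=")
    query = "?ugi=" + quote(ugi, safe=",") if ugi else ""
    base = "http://{}:{}".format(host, port)
    return (base + "/listPaths" + encoded + query,
            base + "/data" + encoded + query)
```

Calling hftp_urls("hdenn00.trueffect.com", 50070, "/user/username/logfiles/day_id=19991231/hour_id=1999123123/000008_0") gives two plain http:// URLs that can be fetched separately, which helps show whether only the data path is returning the 500.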

Here is a stack trace of what distcp logs when I attempt this:

java.io.IOException: HTTP_OK expected, received 500
   at org.apache.hadoop.hdfs.HftpFileSystem$RangeHeaderUrlOpener.connect(HftpFileSystem.java:365)
   at org.apache.hadoop.hdfs.ByteRangeInputStream.openInputStream(ByteRangeInputStream.java:119)
   at org.apache.hadoop.hdfs.ByteRangeInputStream.getInputStream(ByteRangeInputStream.java:103)
   at org.apache.hadoop.hdfs.ByteRangeInputStream.read(ByteRangeInputStream.java:187)
   at java.io.DataInputStream.read(DataInputStream.java:83)
   at org.apache.hadoop.tools.DistCp$CopyFilesMapper.copy(DistCp.java:424)
   at org.apache.hadoop.tools.DistCp$CopyFilesMapper.map(DistCp.java:547)
   at org.apache.hadoop.tools.DistCp$CopyFilesMapper.map(DistCp.java:314)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:393)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:327)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
   at org.apache.hadoop.mapred.Child.main(Child.java:262)

Can someone tell me why hftp is failing to serve files, or at least where to look?
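
As a starting point for "where to look", here is a minimal sketch that requests a URL without following redirects and records the status of each hop. The idea is that if the NameNode answers a file request by redirecting to another node (an assumption about how hftp reads are served, not something confirmed in this thread), the hop list shows which host actually produced the 500, and that host's logs are the next place to check. The name probe_chain is hypothetical.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def probe_chain(url, max_hops=5, timeout=10):
    """Follow redirects by hand and return a list of (status, url) per hop."""
    hops = []
    for _ in range(max_hops):
        parts = urlsplit(url)
        conn = http.client.HTTPConnection(
            parts.hostname, parts.port or 80, timeout=timeout)
        try:
            target = parts.path or "/"
            if parts.query:
                target += "?" + parts.query
            conn.request("GET", target)
            resp = conn.getresponse()
            status = resp.status
            location = resp.getheader("Location")
        finally:
            conn.close()
        hops.append((status, url))
        if status in REDIRECT_CODES and location:
            # Resolve relative Location headers against the current URL.
            url = urljoin(url, location)
        else:
            break
    return hops
```

Each entry in the returned list pairs an HTTP status with the URL that produced it, so a result ending in a 500 identifies the host and port whose server log should contain the underlying exception.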
