hadoop-hdfs-issues mailing list archives

From "Kris Jirapinyo (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-31) Hadoop distcp tool fails if file path contains special characters + & !
Date Mon, 16 Aug 2010 21:48:17 GMT

    [ https://issues.apache.org/jira/browse/HDFS-31?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899111#action_12899111 ]

Kris Jirapinyo commented on HDFS-31:
------------------------------------

Yes, that would be nice.

I was using hftp to copy from a 0.20.1 cluster to a CDH3 cluster (starting distcp on the CDH3 cluster),
and I ran into the same 500 error.  It seems that the URL escaping mechanism is producing an
incorrect final fetch URL: each + in the path is turned into a space.

e.g.

file in HDFS: 
/test/twitteruserout2/_logs/history/mi-prod-app01.ec2.biz360.com_1269013964063_job_201003190852_17784_hadoop_twitter+users+extraction+from+source+on+Tue+Apr+20

fetch filename:
/test/twitteruserout2/_logs/history/mi-prod-app01.ec2.biz360.com_1269013964063_job_201003190852_17784_hadoop_twitter users extraction from source on Tue Apr 20

Error from specific machine:
2010-08-16 14:33:06,765 WARN org.mortbay.log: /streamFile: java.io.IOException: Cannot open filename /test/twitteruserout2/_logs/history/mi-prod-app01.ec2.biz360.com_1269013964063_job_201003190852_17784_hadoop_twitter users extraction from source on Tue Apr 20


Requesting the file directly over HTTP:

http://mi-prod-app28:50075/streamFile?filename=/test/twitteruserout2/_logs/history/mi-prod-app01.ec2.biz360.com_1269013964063_job_201003190852_17784_hadoop_twitter+users+extraction+from+source+on+Tue+Apr+20&ugi=hadoop,hadoop

This does not work and gives the same error as above.
However, if I replace each + with %2B, then the GET works.
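
To illustrate why the %2B substitution helps, here is a minimal Python sketch of the decoding behaviour. It is not the HftpFileSystem code and not a proposed patch. The helper name build_stream_query is hypothetical, and the sketch assumes the server decodes the query string with form-style rules, where a bare + means a space (this matches the fetch filename seen above).

```python
from urllib.parse import parse_qs, quote


def build_stream_query(path, ugi):
    """Hypothetical helper: build a /streamFile query string with the
    filename percent-encoded, so '+', '&' and '!' survive decoding."""
    # quote() encodes '+' as %2B, '&' as %26 and '!' as %21.
    # '/' is kept as-is so the path stays readable.
    return "filename=" + quote(path, safe="/") + "&ugi=" + quote(ugi, safe=",")


raw_path = "/myhome/dir/string1+string2"

# Unescaped query: form-style decoding turns '+' into a space.
unescaped = parse_qs("filename=" + raw_path + "&ugi=hadoop,hadoop")

# Escaped query: %2B decodes back to a literal '+'.
escaped = parse_qs(build_stream_query(raw_path, "hadoop,hadoop"))

print(unescaped["filename"][0])
print(escaped["filename"][0])
```

The first value printed has a space where the + was, which is the filename the server then fails to open. The second round-trips to the original path.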

> Hadoop distcp tool fails if file path contains special characters + & !
> -----------------------------------------------------------------------
>
>                 Key: HDFS-31
>                 URL: https://issues.apache.org/jira/browse/HDFS-31
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: tools
>    Affects Versions: 0.20.2, 0.21.0, 0.22.0
>            Reporter: Viraj Bhat
>             Fix For: 0.22.0
>
>
> Copying folders containing + & ! characters between hdfs (using hftp) does not work in distcp
> For example: 
> Copying  folder "string1+string2"  at "namenode.address.com", hftp port myport to "/myotherhome/folder" on "myothermachine" does not work 
> myothermachine prompt>>> hadoop --config ~/mycluster/ distcp  "hftp://namenode.address.com:myport/myhome/dir/string1+string2" /myotherhome/folder/
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Error results for hadoop job1:
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> 08/07/16 00:27:39 INFO tools.DistCp: srcPaths=[hftp://namenode.address.com:myport/myhome/dir/string1+string2]
> 08/07/16 00:27:39 INFO tools.DistCp: destPath=/myotherhome/folder/
> 08/07/16 00:27:41 INFO tools.DistCp: srcCount=2
> 08/07/16 00:27:42 INFO mapred.JobClient: Running job: job1
> 08/07/16 00:27:43 INFO mapred.JobClient:  map 0% reduce 0%
> 08/07/16 00:27:58 INFO mapred.JobClient: Task Id : attempt_1_m_000000_0, Status : FAILED
> java.io.IOException: Copied: 0 Skipped: 0 Failed: 1
>         at org.apache.hadoop.tools.DistCp$CopyFilesMapper.close(DistCp.java:538)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:226)
>         at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2208)
> 08/07/16 00:28:14 INFO mapred.JobClient: Task Id : attempt_1_m_000000_1, Status : FAILED
> java.io.IOException: Copied: 0 Skipped: 0 Failed: 1
>         at org.apache.hadoop.tools.DistCp$CopyFilesMapper.close(DistCp.java:538)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:226)
>         at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2208)
> 08/07/16 00:28:28 INFO mapred.JobClient: Task Id : attempt_1_m_000000_2, Status : FAILED
> java.io.IOException: Copied: 0 Skipped: 0 Failed: 1
>         at org.apache.hadoop.tools.DistCp$CopyFilesMapper.close(DistCp.java:538)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:226)
>         at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2208)
> With failures, global counters are inaccurate; consider running with -i
> Copy failed: java.io.IOException: Job failed!
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1053)
>         at org.apache.hadoop.tools.DistCp.copy(DistCp.java:615)
>         at org.apache.hadoop.tools.DistCp.run(DistCp.java:764)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>         at org.apache.hadoop.tools.DistCp.main(DistCp.java:784)
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Error log for the map task which failed
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> INFO org.apache.hadoop.tools.DistCp: FAIL string1+string2/myjobtrackermachine.com-joblog.tar.gz : java.io.IOException: Server returned HTTP response code: 500 for URL: http://mymachine.com:myport/streamFile?filename=/myhome/dir/string1+string2/myjobtrackermachine.com-joblog.tar.gz&ugi=myid,mygroup
> 	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1241)
> 	at org.apache.hadoop.dfs.HftpFileSystem.open(HftpFileSystem.java:117)
> 	at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:371)
> 	at org.apache.hadoop.tools.DistCp$CopyFilesMapper.copy(DistCp.java:377)
> 	at org.apache.hadoop.tools.DistCp$CopyFilesMapper.map(DistCp.java:504)
> 	at org.apache.hadoop.tools.DistCp$CopyFilesMapper.map(DistCp.java:279)
> 	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:226)
> 	at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2208)
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

