hadoop-common-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-10007) distcp / mv is not working on ftp
Date Fri, 09 Mar 2018 18:06:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-10007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16393299#comment-16393299 ]

Steve Loughran commented on HADOOP-10007:
-----------------------------------------

Reopening as I can reproduce this locally. The problem is that you can't rename from a temp
dir to the final destination, and distcp copies to a temp dir and then renames the files
into place.

We'll need HADOOP-15281 to fix this, which is also needed for performance on S3 and other
expensive-to-rename stores.
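
The copy-then-rename pattern described above can be sketched on a local filesystem with
java.nio.file. This is only an illustration under stated assumptions, not distcp's or the
committer's actual code: the class name, the copyViaTemp helper and the "_temporary"
directory name are hypothetical. The point is that the final step is a rename, so a store
that cannot rename from the temp dir into the destination fails at commit time even though
the data was written successfully.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class TempThenRenameSketch {

    // Hypothetical helper: stage the data under tempDir, then rename it into finalDir.
    public static Path copyViaTemp(byte[] data, Path tempDir, Path finalDir, String name) {
        try {
            Files.createDirectories(tempDir);
            Files.createDirectories(finalDir);

            // Step 1: write the file in the staging directory.
            Path staged = tempDir.resolve(name);
            Files.write(staged, data);

            // Step 2: "commit" by renaming into the destination directory.
            // On a store where this rename is unsupported or broken, the copy
            // fails here, after all the bytes have already been transferred.
            Path target = finalDir.resolve(name);
            return Files.move(staged, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("rename-sketch");
        Path committed = copyViaTemp("hello".getBytes(StandardCharsets.UTF_8),
                root.resolve("_temporary"), root.resolve("data"), "test.file");
        System.out.println("committed to " + committed);
    }
}
```

On a local disk the rename is cheap and reliable; on stores where rename is slow or
unreliable it is the weak point of this pattern.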

> distcp / mv is not working on ftp
> ---------------------------------
>
>                 Key: HADOOP-10007
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10007
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>         Environment: Ubuntu 12.04.2 LTS
> Hadoop 2.0.0-cdh4.2.1
> Subversion file:///var/lib/jenkins/workspace/generic-package-ubuntu64-12-04/CDH4.2.1-Packaging-Hadoop-2013-04-22_09-50-19/hadoop-2.0.0+960-1.cdh4.2.1.p0.9~precise/src/hadoop-common-project/hadoop-common -r 144bd548d481c2774fab2bec2ac2645d190f705b
> Compiled by jenkins on Mon Apr 22 10:26:30 PDT 2013
> From source with checksum aef88defdddfb22327a107fbd7063395
>            Reporter: Fabian Zimmermann
>            Priority: Major
>
> I'm just trying to back up some files to our FTP server.
> hadoop distcp hdfs:///data/ ftp://user:pass@server/data/
> returns after some minutes with:
> Task TASKID="task_201308231529_97700_m_000002" TASK_TYPE="MAP" TASK_STATUS="FAILED" FINISH_TIME="1380217916479"
> ERROR="java.io.IOException: Cannot rename parent(source): ftp://x:x@backup2/data/, parent(destination):
>  ftp://x:x@backup2/data/
> 	at org.apache.hadoop.fs.ftp.FTPFileSystem.rename(FTPFileSystem.java:557)
> 	at org.apache.hadoop.fs.ftp.FTPFileSystem.rename(FTPFileSystem.java:522)
> 	at org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:154)
> 	at org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:172)
> 	at org.apache.hadoop.mapred.FileOutputCommitter.commitTask(FileOutputCommitter.java:132)
> 	at org.apache.hadoop.mapred.OutputCommitter.commitTask(OutputCommitter.java:221)
> 	at org.apache.hadoop.mapred.Task.commit(Task.java:1000)
> 	at org.apache.hadoop.mapred.Task.done(Task.java:870)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:329)
> 	at org.apache.hadoop.mapred.Child$4.run" TASK_ATTEMPT_ID="" .
> I googled a bit and added
> fs.ftp.host = backup2
> fs.ftp.user.backup2 = user
> fs.ftp.password.backup2 = password
> to core-site.xml, then I was able to execute:
> hadoop fs -ls ftp:///data/
> hadoop fs -rm ftp:///data/test.file
> but as soon as I try
> hadoop fs -mv file:///data/test.file ftp:///data/test2.file
> it fails with:
> mv: `ftp:///data/test.file': Input/output error
> I enabled debug-logging in our ftp-server and got:
> Sep 27 15:24:33 backup2 ftpd[38241]: command: LIST /data
> Sep 27 15:24:33 backup2 ftpd[38241]: <--- 150
> Sep 27 15:24:33 backup2 ftpd[38241]: Opening BINARY mode data connection for '/bin/ls'.
> Sep 27 15:24:33 backup2 ftpd[38241]: <--- 226
> Sep 27 15:24:33 backup2 ftpd[38241]: Transfer complete.
> Sep 27 15:24:33 backup2 ftpd[38241]: command: CWD ftp:/data
> Sep 27 15:24:33 backup2 ftpd[38241]: <--- 550
> Sep 27 15:24:33 backup2 ftpd[38241]: ftp:/data: No such file or directory.
> Sep 27 15:24:33 backup2 ftpd[38241]: command: RNFR test.file
> Sep 27 15:24:33 backup2 ftpd[38241]: <--- 550
> It looks like the generation of the "CWD" command is buggy: hadoop tries to cd into
> "ftp:/data", but should use "/data".
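
The "ftp:/data" versus "/data" difference seen in the server log can be reproduced with
plain java.net.URI. This is a minimal sketch, not the FTPFileSystem source: the class and
method names are hypothetical, and it only assumes that the directory is held as a URI with
a scheme and no authority. Converting such a URI to a string keeps the scheme, while
getPath() returns only the part an FTP server can interpret as a CWD argument.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class CwdArgumentSketch {

    // Build a URI with a scheme and path but no authority, e.g. scheme "ftp", dir "/data".
    private static URI directoryUri(String scheme, String dir) {
        try {
            return new URI(scheme, null, dir, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Full string form of the URI: includes the scheme, so it is not a server-side path.
    public static String uriString(String scheme, String dir) {
        return directoryUri(scheme, dir).toString();
    }

    // Path component only: the form an FTP server expects after CWD.
    public static String pathOnly(String scheme, String dir) {
        return directoryUri(scheme, dir).getPath();
    }

    public static void main(String[] args) {
        System.out.println("CWD " + uriString("ftp", "/data")); // CWD ftp:/data
        System.out.println("CWD " + pathOnly("ftp", "/data"));  // CWD /data
    }
}
```

The first form matches the "CWD ftp:/data" command that the server rejected with 550; the
second is the argument the reporter expected.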



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org

