hadoop-common-issues mailing list archives

From "Yuanbo Liu (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HADOOP-13593) `hadoop distcp -atomic` invokes improper host check while copying data from HDFS to Swift
Date Mon, 12 Sep 2016 09:57:21 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15483686#comment-15483686 ]

Yuanbo Liu edited comment on HADOOP-13593 at 9/12/16 9:56 AM:
--------------------------------------------------------------

[~steve_l] Thanks a lot for your comments; they are really helpful.
{quote}
1. please can you stick the full stack trace of the exception in as a comment..
{quote}
Sorry for omitting the stack trace. I will edit my first comment to add it.

{quote}
2. anything checking hostnames is going
{quote}
In fact there is a code segment in {{FileUtils#compareFs}} as below:
{code}
String srcHost = srcUri.getHost();
String dstHost = dstUri.getHost();
if (!srcHost.equals(dstHost)) {
        return false;
}
{code}
and I think it covers the case you mentioned above. Using "getCanonicalHostName" to double-check
whether the hosts are equal seems good, but if the host name is an alias, the lookup may throw
UnknownHostException here. If you don't agree to remove the check, the least we can do is
make the error message more accurate: "Work path..in different file system" is not right.
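
To make the trade-off concrete, here is a minimal sketch of one possible compromise: compare the literal host names first, and fall back to a canonical-name lookup only when they differ. This is not the actual Hadoop method or the attached patch; the class {{HostCheckSketch}} and the helper {{sameHost}} are hypothetical names used only for illustration.

```java
import java.net.InetAddress;
import java.net.URI;
import java.net.UnknownHostException;

public class HostCheckSketch {

    // Hypothetical helper, not part of Hadoop: decides whether two URIs name the same host.
    static boolean sameHost(URI srcUri, URI dstUri) {
        String srcHost = srcUri.getHost();
        String dstHost = dstUri.getHost();
        if (srcHost == null || dstHost == null) {
            // Equal only when both URIs lack a host component.
            return srcHost == null && dstHost == null;
        }
        if (srcHost.equalsIgnoreCase(dstHost)) {
            // Identical literal names: no DNS lookup is needed, so an
            // unresolvable alias cannot cause a failure here.
            return true;
        }
        try {
            // Names differ literally; they may still resolve to the same canonical host.
            String srcCanonical = InetAddress.getByName(srcHost).getCanonicalHostName();
            String dstCanonical = InetAddress.getByName(dstHost).getCanonicalHostName();
            return srcCanonical.equals(dstCanonical);
        } catch (UnknownHostException e) {
            // At least one name cannot be resolved; treat the hosts as different.
            return false;
        }
    }

    public static void main(String[] args) {
        URI work = URI.create("swift://testhadoop.softlayer/._WIP_tmp546958075");
        URI target = URI.create("swift://testhadoop.softlayer/tmp");
        System.out.println(sameHost(work, target)); // prints "true"
    }
}
```

With the two paths from the issue description, the literal comparison already succeeds, so the sketch never attempts to resolve {{testhadoop.softlayer}}.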

{quote}
3. none of the object stores support atomic renames...
{quote}
Thanks for the info. Yes, you're right: if the object store doesn't support atomic rename, it's
not appropriate to use `distcp -atomic` here.
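
For context, `-atomic` relies on staging data under a temporary work path and then publishing it with a single rename. The sketch below illustrates that pattern on a local file system with {{java.nio.file}}. It is an analogy only, not DistCp code; the class {{AtomicCommitSketch}} and the helper {{commitAtomically}} are hypothetical, and the {{._WIP_}} prefix is borrowed from the work path in the error message.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class AtomicCommitSketch {

    // Hypothetical helper: stage data in a sibling work file, then publish it with one rename.
    static Path commitAtomically(Path target, byte[] data) throws IOException {
        Path work = target.resolveSibling("._WIP_" + target.getFileName());
        Files.write(work, data);
        // ATOMIC_MOVE throws AtomicMoveNotSupportedException when the underlying
        // file system cannot perform the move as one atomic operation.
        return Files.move(work, target, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("atomic-sketch");
        Path target = commitAtomically(dir.resolve("tmp"),
                "payload".getBytes(StandardCharsets.UTF_8));
        System.out.println(Files.exists(target)); // prints "true"
    }
}
```

A store that implements rename as copy-then-delete cannot give this guarantee, which is why the pattern does not carry over to object stores.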

{quote}
If there were to be a patch on this, it'd need tests. Here I'd recommend 
{quote}
Thanks for your suggestions. I will investigate them later.
Thanks again for your time!



> `hadoop distcp -atomic` invokes improper host check while copying data from HDFS to Swift
> -----------------------------------------------------------------------------------------
>
>                 Key: HADOOP-13593
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13593
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Yuanbo Liu
>         Attachments: HADOOP-13593.001.patch, HADOOP-13593.002.patch
>
>
> While copying data from HDFS to Swift by using `hadoop distcp -atomic`, for example:
> {code}
> hadoop distcp -atomic /tmp/100M  swift://testhadoop.softlayer//tmp
> {code}
> it throws
> {code}
> java.lang.IllegalArgumentException: Work path swift://testhadoop.softlayer/._WIP_tmp546958075 and target path swift://testhadoop.softlayer/tmp are in different file system
> 	at org.apache.hadoop.tools.DistCp.configureOutputFormat(DistCp.java:351)
> .....
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


