hadoop-hdfs-user mailing list archives

From 麦树荣 <shurong....@qunar.com>
Subject Re: Copy Vs DistCP
Date Wed, 10 Apr 2013 23:28:20 GMT

I think it's better to use cp within the same cluster and DistCP between clusters. The
cp command is an internal Hadoop parallel process and will not copy files locally.
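
As a rough illustration of that rule of thumb, here is a minimal sketch that only prints the command it would suggest and never touches a cluster. The helper names (authority, suggest_copy) and the NameNode addresses nn1:8020 and nn2:8020 are made up for the example, and it assumes that a path without an hdfs:// prefix lives on the default cluster.

```shell
#!/bin/sh
# Hypothetical helper: extract the NameNode authority (host:port) from a path.
# Assumption: a path without an hdfs:// scheme is on the default cluster.
authority() {
  case "$1" in
    hdfs://*) rest=${1#hdfs://}; printf '%s\n' "${rest%%/*}" ;;
    *) printf 'default\n' ;;
  esac
}

# Hypothetical helper: print (do not run) the suggested copy command.
# Same authority -> hdfs dfs -cp; different authority -> hadoop distcp.
suggest_copy() {
  src=$1
  dst=$2
  if [ "$(authority "$src")" = "$(authority "$dst")" ]; then
    printf 'hdfs dfs -cp %s %s\n' "$src" "$dst"
  else
    printf 'hadoop distcp %s %s\n' "$src" "$dst"
  fi
}

# Same cluster: suggests hdfs dfs -cp
suggest_copy /data/file /backup/
# Two clusters (nn1 and nn2 are placeholder hosts): suggests hadoop distcp
suggest_copy hdfs://nn1:8020/data/file hdfs://nn2:8020/backup/
```

This only encodes the same-cluster versus cross-cluster split suggested above. It does not account for file size or file count, which may also matter when choosing between the two.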


From: KayVajj <vajjalak009@gmail.com>
Date: 2013-04-11 06:20
To: user@hadoop.apache.org
Subject: Copy Vs DistCP
I have a few questions regarding the use of DistCP for copying files within the same cluster.

1) Which one is better within the same cluster, and what factors (such as file size) would
influence the choice of one over the other?

2) When we run a cp command like the one below from a client node of the cluster (not a data node),
how does the cp command work?
     i) Like an MR job, or
    ii) by copying the files locally and then copying them back to the new location?

Example of the copy command

hdfs dfs -cp /<some_location>/file /<new_location>/

Thanks, your responses are appreciated.

-- Kay