hadoop-common-user mailing list archives

From <hadoop.supp...@visolve.com>
Subject RE: Copy data between clusters during the job execution.
Date Tue, 03 Feb 2015 05:23:19 GMT
It seems from your first error message that you got the source directory argument slightly wrong.
One common usage of distcp is:


Distcp (solution to your problem)

hadoop distcp hdfs://hadoop-coc-1:50070/input1 hdfs://hadoop-coc-2:50070/some1


It is also wise to use the latest version of the tool:


hadoop distcp hdfs://hadoop-coc-1:50070/input1 hdfs://hadoop-coc-2:50070/some1


Here hdfs://hadoop-coc-1:50070/input1 is the source directory, and hdfs://hadoop-coc-2:50070/some1 is the destination directory on the other cluster.



If you need to, you can provide multiple source directories in the command, using “\” to continue it across lines.
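As a rough sketch of the multiple-directory case: distcp accepts several source paths followed by a single destination. The helper name copy_dirs, the HADOOP_BIN variable, and the /input2 path below are hypothetical, added only for illustration; the URIs are taken from this thread as-is, and the port should be whatever your NameNode actually serves HDFS RPC on.

```shell
#!/usr/bin/env bash
# Sketch only: copy several source directories to one destination with distcp.
# copy_dirs and HADOOP_BIN are hypothetical names, not part of Hadoop.
# Usage: copy_dirs DEST SRC [SRC...]
copy_dirs() {
  local dst="$1"
  shift
  # distcp takes all sources first and the destination last.
  "${HADOOP_BIN:-hadoop}" distcp "$@" "$dst"
}

# Equivalent direct command, split across lines with "\"
# (/input2 is an assumed second directory):
#
#   hadoop distcp \
#     hdfs://hadoop-coc-1:50070/input1 \
#     hdfs://hadoop-coc-1:50070/input2 \
#     hdfs://hadoop-coc-2:50070/some1
```

Setting HADOOP_BIN=echo prints the command instead of running it, which is a cheap way to check the argument order before copying anything.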


Thanks and Regards, 
Hadoop Support Team 
ViSolve Inc. | www.visolve.com



From: dbist13@gmail.com [mailto:dbist13@gmail.com] On Behalf Of Artem Ervits
Sent: Tuesday, February 03, 2015 6:49 AM
To: user@hadoop.apache.org
Subject: Re: Copy data between clusters during the job execution.


Take a look at Oozie; once the first job completes, you can distcp to another server.

Artem Ervits

On Feb 2, 2015 5:46 AM, "Daniel Haviv" <danielrulez@gmail.com> wrote:

It should run after your job finishes.

You can create the flow using a simple bash script
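
A minimal sketch of such a script, assuming the job is launched from the command line: it runs the job and starts distcp only if the job exits successfully. The function name run_then_copy, the HADOOP_BIN variable, and the jar and class names in the usage comment are hypothetical; the URIs are the ones used elsewhere in this thread.

```shell
#!/usr/bin/env bash
# Sketch only: run a job, then copy its data to the second cluster.
# run_then_copy and HADOOP_BIN are hypothetical names, not part of Hadoop.
# Usage: run_then_copy SRC DEST JOB_COMMAND [ARGS...]
run_then_copy() {
  local src="$1" dst="$2"
  shift 2
  # "$@" is now the job command; run it and check its exit status.
  if "$@"; then
    "${HADOOP_BIN:-hadoop}" distcp "$src" "$dst"
  else
    echo "job failed; skipping distcp" >&2
    return 1
  fi
}

# Example call (my-job.jar and MyJob are placeholders):
#
#   run_then_copy hdfs://hadoop-coc-1:50070/input1 \
#                 hdfs://hadoop-coc-2:50070/some1 \
#                 hadoop jar my-job.jar MyJob
```

Because the copy is gated on the job's exit status, a failed job never triggers a transfer of partial data.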


On 2 Feb 2015, at 12:31, xeonmailinglist <xeonmailinglist@gmail.com> wrote:

But can I use distcp inside my job, or do I need to program something that executes distcp after
executing my job?

On 02-02-2015 10:20, Daniel Haviv wrote:

You can use distcp




On 2 Feb 2015, at 11:12,

