hadoop-mapreduce-user mailing list archives

From John Meza <j_meza...@hotmail.com>
Subject copytolocal vs distcp
Date Sat, 09 Mar 2013 18:07:56 GMT
I need suggestions on the best way to copy a lot of data (~6 TB) from a cluster (20 datanodes) to
the local file system.
While distcp should have much higher throughput than copyToLocal (I think) because it uses MR
jobs, it doesn't seem to work well with the following syntax: <desturl> = "file://fs4/outdir/"

Problem: it puts the output in the home dir of the Linux user. To get this to work, do I need to
redefine the user's home dir to be the output dir (a LUN) with lots of disk space?
copyToLocal is straightforward to use, but lacks the throughput (I think).
Suggestions? Advice?

Thanks,
John
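
A note on the destination URL quoted above: under generic URI syntax, the text right after "file://" is read as the authority (host), not as the first path component. The sketch below uses Python's standard urllib.parse purely as an illustration of that rule. It does not show how Hadoop itself resolves the path, and the three-slash spelling and the helper name split_file_uri are assumptions introduced here, not something taken from the original message.

```python
from urllib.parse import urlparse


def split_file_uri(uri):
    """Return (authority, path) as a generic URI parser sees them.

    Hypothetical helper for illustration only; not part of Hadoop.
    """
    parts = urlparse(uri)
    return parts.netloc, parts.path


if __name__ == "__main__":
    # Two slashes: "fs4" lands in the authority slot, path is "/outdir/".
    # Three slashes: empty authority, path is "/fs4/outdir/".
    for uri in ("file://fs4/outdir/", "file:///fs4/outdir/"):
        authority, path = split_file_uri(uri)
        print(f"{uri!r}: authority={authority!r} path={path!r}")
```

If /fs4/outdir is meant as an absolute local path, the three-slash form is the one that expresses it as a path. Whether that accounts for the home-directory behaviour described above is not confirmed here.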