hadoop-common-user mailing list archives

From prasenjit mukherjee <prasen....@gmail.com>
Subject distributing hadoop push
Date Sat, 23 Jan 2010 16:57:19 GMT
I have hundreds of large files (~100 MB each) under a /mnt/ location that is
shared by all my Hadoop nodes. I was wondering whether I could directly use
"hadoop distcp file:///mnt/data/tr* /input" to parallelize/distribute the
Hadoop push. The push is becoming a bottleneck for me, and any help in this
regard is greatly appreciated. Currently I am using "hadoop fs -moveFromLocal
..." and it is taking too much time.

