hadoop-common-dev mailing list archives

From Jean-Pierre <jean-pierre.oca...@247realmedia.com>
Subject [Map/Reduce][HDFS]
Date Thu, 27 Mar 2008 19:41:59 GMT

I'm working with a large volume of logs, and I've noticed that
distributing the data across the network (./hadoop dfs -put input input)
takes a lot of time.

Let's say that my data is already distributed across the network. Is
there any way to tell Hadoop to use the existing distribution?


Jean-Pierre <jean-pierre.ocalan@247realmedia.com>
