hadoop-mapreduce-user mailing list archives

From Allen Wittenauer ...@apache.org>
Subject Re: Lack of data locality in Hadoop-0.20.2
Date Tue, 12 Jul 2011 18:20:21 GMT

On Jul 12, 2011, at 10:27 AM, Virajith Jalaparti wrote:

> I agree that the scheduler has less leeway when the replication factor is
> 1. However, I would still expect the number of data-local tasks to be more
> than 10% even when the replication factor is 1.

	How did you load your data?

	Did you load it from outside the grid or from one of the datanodes?  If you loaded from one
of the datanodes, you'll basically have no real locality, especially with a rep factor of
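
	To make the reasoning concrete, here is a rough toy model, not the actual JobTracker
logic. It relies on the default HDFS behaviour that a client running on a datanode gets
the first replica of every block written to that same node. Everything else is an
assumption for illustration: the node names, the 10-node cluster, the 100 blocks, and the
round-robin "scheduler" in which each node asks for one task per turn and takes a local
block if it still holds one.

```python
from collections import Counter

# Hypothetical 10-node cluster; names are made up for this sketch.
NODES = ["dn%d" % i for i in range(10)]


def simulate_locality(placement, nodes):
    """Return the fraction of tasks that run data-local.

    placement: one entry per block, naming the node that holds its only
               replica (replication factor 1).
    Toy scheduler (an assumption): nodes request work in round-robin order.
    A node takes one of its own blocks if any remain, otherwise it takes a
    block stored on some other node (a non-local task).
    """
    remaining = Counter(placement)  # node -> unprocessed blocks stored there
    total = len(placement)
    local = 0
    done = 0
    while done < total:
        for node in nodes:
            if done == total:
                break
            if remaining[node] > 0:
                remaining[node] -= 1
                local += 1
            else:
                # Pull a block from the node with the most work left.
                donor = max(remaining, key=remaining.get)
                remaining[donor] -= 1
            done += 1
    return local / total


# Loaded from dn0 itself: every block's single replica sits on dn0.
loaded_from_datanode = ["dn0"] * 100

# Loaded from outside the grid: idealized as an even spread over all nodes.
loaded_from_outside = [NODES[i % len(NODES)] for i in range(100)]
```

	In this model only the node that holds the data can ever run a local task, so the
data-local fraction is roughly one over the number of nodes doing work, while the evenly
spread load lets every task run locally. A real cluster will differ, but the direction of
the effect is the same.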
