hadoop-general mailing list archives

From Grace <syso...@gmail.com>
Subject why do data-local maps still need to do remote reads?
Date Wed, 26 May 2010 01:44:42 GMT
Hi all,

According to the map task scheduling rules, the scheduler prefers to assign a
task to a node that holds its input data locally. Judging by the Data-local map
counter (in the job report), locality is indeed very high across all the map
tasks. However, the DataNode metrics (read_from_local and read_from_remote)
show a higher remote read rate than the job report implies. It seems the
Data-local map counter is not as accurate as we expected. When, or why, would a
map task that has already been assigned as data-local still trigger a remote
HDFS read?

Thanks for your time.

Best Regards,
