hive-user mailing list archives

From Tim Robertson
Subject Data-local map tasks - reported correctly in JT?
Date Wed, 02 May 2012 21:01:56 GMT
Hi all,

I have a 6-node cluster, and on a simple query against a table created from
a CSV, I was seeing a lot of mappers reporting that they were not using
local data. I changed the replication factor to 6, but MR still shows only
about 60% of the map tasks as data-local.

How can this be when I have no under-replicated blocks and the replication
count equals the machine count?  Am I missing something?  Does it indicate
that something is wrong in the MR configuration (e.g. a TT not recognizing
localhost for the DN)?
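
To make the hostname hypothesis concrete, here is a minimal Python sketch. It
is not Hadoop code: it assumes that a task counts as data-local only when the
host name its tracker reports matches, as an exact string, one of the host
names listed for the input split. The function name, the node names, and the
task counts are all hypothetical and chosen only for illustration.

```python
def locality_fraction(split_hosts, assigned_hosts):
    """Fraction of tasks whose assigned host name appears, by exact string
    match, in the host list of that task's input split (assumed rule)."""
    if len(split_hosts) != len(assigned_hosts):
        raise ValueError("need exactly one assigned host per split")
    if not split_hosts:
        return 0.0
    local = sum(1 for hosts, tt in zip(split_hosts, assigned_hosts)
                if tt in hosts)
    return float(local) / len(split_hosts)


# Hypothetical 6-node cluster with replication 6: every split lists all nodes.
nodes = ["node%d.example.org" % i for i in range(1, 7)]
splits = [list(nodes) for _ in range(10)]

# Case 1: every tracker reports the same name the splits use.
consistent = [nodes[i % 6] for i in range(10)]

# Case 2 (hypothetical misconfiguration): two trackers report names that no
# split lists, so their tasks can never match even though the data is there.
renamed = {"node5.example.org": "localhost", "node6.example.org": "node6"}
mismatched = [renamed.get(h, h) for h in consistent]

print(locality_fraction(splits, consistent))
print(locality_fraction(splits, mismatched))
```

Under that assumed matching rule, full replication alone would not guarantee
100% data-local tasks: any tracker whose reported name differs from the names
attached to the blocks would always be counted as non-local.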

The 6 machines each have 12 spindles in them and I'm running Hive 0.7 and
0.9 trunk built about 2 weeks ago.

Many thanks!
