hadoop-common-user mailing list archives

From Drake민영근 <drake....@nexr.com>
Subject Re: tracking remote reads in datanode logs
Date Tue, 24 Feb 2015 00:45:22 GMT
Hi, Igor

Did you look at the MapReduce ApplicationMaster (AM) log? I think the
node-local and rack-local map tasks are logged there.

Good luck.

Drake 민영근 Ph.D
kt NexR

On Tue, Feb 24, 2015 at 3:30 AM, Igor Bogomolov <igor.bogomolov@gmail.com> wrote:

> Hi all,
> In a small 5-node cluster running CDH 5.3.0 (Hadoop 2.5.0), I want to
> know how many remote map tasks (ones that read input data from remote
> nodes) there are in a MapReduce job. For this purpose I took the logs of
> each datanode and looked for lines with "op: HDFS_READ" and a cliID field
> that contains a map task id.
> Surprisingly, 4 of the datanode logs do not contain any lines with
> "op: HDFS_READ". The remaining one has many lines with "op: HDFS_READ",
> but every cliID looks like DFSClient_NONMAPREDUCE_* and does not contain
> a map task id.
> I concluded that there are no remote map tasks, but that does not look
> correct. Also, even local reads are not logged (because there is no line
> where the cliID field contains a map task id). Could anyone please explain
> what's wrong? Why is logging not working? (I use the default settings.)
> Chris,
> I found HADOOP-3062 <https://issues.apache.org/jira/browse/HADOOP-3062>,
> which you implemented. I thought you might have an explanation.
> Best,
> Igor
