hadoop-hdfs-user mailing list archives

From Drake민영근 <drake....@nexr.com>
Subject Re: tracking remote reads in datanode logs
Date Tue, 24 Feb 2015 00:51:20 GMT
I found this in the MapReduce AM log:

2015-02-23 11:22:45,576 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before
Scheduling: PendingReds:1 ScheduledMaps:5 ScheduledReds:0 AssignedMaps:0
AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:0 ContRel:0
HostLocal:0 RackLocal:0
2015-02-23 11:22:46,641 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After
Scheduling: PendingReds:1 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:5
AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:5 ContRel:0
HostLocal:3 RackLocal:2

The first line shows 5 scheduled map tasks (ScheduledMaps:5). The second
shows that all 5 were assigned, with HostLocal:3 and RackLocal:2. I think
the 2 rack-local tasks are the remote map tasks you mentioned.
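
In case it helps, here is a minimal Python sketch for pulling those counters
out of an "After Scheduling" line like the one above. The function name
parse_scheduling_counters is only my own illustration and not part of Hadoop,
and I am assuming the wrapped log line has been joined back into one string.

```python
import re

# Matches counter pairs such as "HostLocal:3" in an RMContainerAllocator line.
COUNTER_RE = re.compile(r"(\w+):(\d+)")


def parse_scheduling_counters(line):
    """Return the counters that follow 'Scheduling:' as a dict of ints.

    Hypothetical helper: only text after 'Scheduling:' is scanned, so the
    timestamp at the start of the line is not mistaken for a counter.
    """
    _, sep, tail = line.partition("Scheduling:")
    if not sep:
        return {}
    return {name: int(value) for name, value in COUNTER_RE.findall(tail)}


# The "After Scheduling" line from the AM log above, joined into one string.
sample = (
    "2015-02-23 11:22:46,641 INFO [RMCommunicator Allocator] "
    "org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After "
    "Scheduling: PendingReds:1 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:5 "
    "AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:5 ContRel:0 "
    "HostLocal:3 RackLocal:2"
)

counters = parse_scheduling_counters(sample)
print("host-local maps:", counters["HostLocal"])
print("rack-local maps:", counters["RackLocal"])
```

Running it over every RMContainerAllocator line in the AM log and keeping the
last result should give the totals for the job, assuming the counters are
cumulative.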

Drake 민영근 Ph.D
kt NexR

On Tue, Feb 24, 2015 at 9:45 AM, Drake민영근 <drake.min@nexr.com> wrote:

> Hi, Igor
> Did you look at the MapReduce ApplicationMaster log? I think the host-local
> and rack-local map task counts are logged in the MapReduce AM log.
> Good luck.
> Drake 민영근 Ph.D
> kt NexR
> On Tue, Feb 24, 2015 at 3:30 AM, Igor Bogomolov <igor.bogomolov@gmail.com>
> wrote:
>> Hi all,
>> In a small cluster of 5 nodes that run CDH 5.3.0 (Hadoop 2.5.0) I want
>> to know how many remote map tasks (ones that read input data from remote
>> nodes) there are in a mapreduce job. For this purpose I took the logs of
>> each datanode and looked for lines with "op: HDFS_READ" and a cliID field
>> that contains a map task id.
>> Surprisingly, 4 datanode logs do not contain any lines with "op: HDFS_READ".
>> The remaining one has many lines with "op: HDFS_READ", but every cliID looks
>> like DFSClient_NONMAPREDUCE_* and does not contain a map task id.
>> I concluded there are no remote map tasks, but that does not look correct.
>> Also, even local reads are not logged (there is no line where the
>> cliID field contains a map task id). Could anyone please explain
>> what's wrong? Why is logging not working? (I use the default settings.)
>> Chris,
>> Found HADOOP-3062 <https://issues.apache.org/jira/browse/HADOOP-3062>
>> that you have implemented. Thought you might have an explanation.
>> Best,
>> Igor
