hadoop-hdfs-user mailing list archives

From yypvsxf19870706 <yypvsxf19870...@gmail.com>
Subject Re: MAP_INPUT_BYTES missing from counters
Date Sat, 06 Apr 2013 11:37:43 GMT
Hi 

     Is your input file compressed, or named with a suffix such as .gz?
     It is interesting.
     MAP_INPUT_BYTES is the number of bytes of uncompressed input consumed by all the maps
in the job. It is incremented every time a record is read from a RecordReader and passed to the map's
map() method by the framework. [Hadoop: The Definitive Guide, page 226]
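
     As a rough illustration of the "total input size" question, here is a minimal Python sketch that looks for an input-byte counter in a counter dump and falls back to other byte counters when MAP_INPUT_BYTES is absent. The counter values are copied from the job output quoted below in this thread. The fallback order and the helper name input_bytes are assumptions made for illustration, not something the thread or the Hadoop documentation confirms.

```python
# Counter values copied from the job output quoted in this thread.
counters = {
    "FileSystemCounters": {"HDFS_BYTES_READ": 1085517},
    "File Input Format Counters": {"BYTES_READ": 1085357},
    "Map-Reduce Framework": {
        "MAP_INPUT_RECORDS": 19446,
        "SPLIT_RAW_BYTES": 160,
    },
}

# Assumed fallback order (hypothetical, chosen for this sketch only):
# prefer MAP_INPUT_BYTES, then the input format's BYTES_READ,
# then the file system's HDFS_BYTES_READ.
CANDIDATES = [
    ("Map-Reduce Framework", "MAP_INPUT_BYTES"),
    ("File Input Format Counters", "BYTES_READ"),
    ("FileSystemCounters", "HDFS_BYTES_READ"),
]


def input_bytes(counters):
    """Return (group, name, value) for the first candidate counter present, else None."""
    for group, name in CANDIDATES:
        value = counters.get(group, {}).get(name)
        if value is not None:
            return group, name, value
    return None


print(input_bytes(counters))
# prints ('File Input Format Counters', 'BYTES_READ', 1085357)
```

     For this particular job, HDFS_BYTES_READ minus BYTES_READ is 160, which matches the reported SPLIT_RAW_BYTES. That is an observation about these numbers only, not a general rule.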

   Please let us know if you find out anything further.

Regards.



Sent from my iPhone

On 2013-4-6, at 0:01, Philippe Signoret <philippe.signoret@gmail.com> wrote:

> I noticed recently that some Word Count jobs I've run are finishing with the MAP_INPUT_BYTES counter missing.
> 
> I'm using Hadoop 1.1.2 with a mostly default configuration on 5 nodes. The input was a single 100KB text file.
> 
> Questions:
> Is it normal for any final counter values not to be present?
> Is MAP_INPUT_BYTES the best way to determine total input data size? (I do so programmatically, while it's running and after the job is complete.)
> The counters I did get:
> 
> Job Counters
>  TOTAL_LAUNCHED_REDUCES: 1
>  SLOTS_MILLIS_MAPS: 6006
>  FALLOW_SLOTS_MILLIS_REDUCES: 0
>  FALLOW_SLOTS_MILLIS_MAPS: 0
>  TOTAL_LAUNCHED_MAPS: 1
>  DATA_LOCAL_MAPS: 1
>  SLOTS_MILLIS_REDUCES: 9293
> File Output Format Counters
>  BYTES_WRITTEN: 366752
> FileSystemCounters
>  FILE_BYTES_READ: 505552
>  HDFS_BYTES_READ: 1085517
>  FILE_BYTES_WRITTEN: 1122685
>  HDFS_BYTES_WRITTEN: 366752
> File Input Format Counters
>  BYTES_READ: 1085357
> Map-Reduce Framework
>  MAP_OUTPUT_MATERIALIZED_BYTES: 505552
>  MAP_INPUT_RECORDS: 19446
>  REDUCE_SHUFFLE_BYTES: 505552
>  SPILLED_RECORDS: 70358
>  MAP_OUTPUT_BYTES: 1750111
>  CPU_MILLISECONDS: 5700
>  COMMITTED_HEAP_BYTES: 401997824
>  COMBINE_INPUT_RECORDS: 181151
>  SPLIT_RAW_BYTES: 160
>  REDUCE_INPUT_RECORDS: 35179
>  REDUCE_INPUT_GROUPS: 35179
>  COMBINE_OUTPUT_RECORDS: 35179
>  PHYSICAL_MEMORY_BYTES: 378482688
>  REDUCE_OUTPUT_RECORDS: 35179
>  VIRTUAL_MEMORY_BYTES: 1139838976
>  MAP_OUTPUT_RECORDS: 181151
> 
> Here are most of the relevant screens from the JobTracker web interface: http://jsfiddle.net/Fguyy/2/embedded/result/
> 
> Here is the JobTracker log (relevant time frame): http://pastebin.com/dvsMn4fB
> 
> Thanks!
> Philippe
> 
> -------------------------------
> Philippe Signoret
> Skype: philippesignoret
> +33 6 95 89 55 55
