hadoop-common-user mailing list archives

From Philippe Signoret <philippe.signo...@gmail.com>
Subject Re: MAP_INPUT_BYTES missing from counters
Date Sat, 06 Apr 2013 23:10:31 GMT
Nope, just a regular plain-text file (a .txt from Project Gutenberg).

I'll keep looking into it and try to reproduce consistently.

Thanks!
Philippe
On Apr 6, 2013 1:39 PM, "yypvsxf19870706" <yypvsxf19870706@gmail.com> wrote:

> Hi
>
> Is your input file compressed, or named with a .gz suffix or something
> similar?
> This is interesting.
> MAP_INPUT_BYTES is the number of bytes of uncompressed input consumed by
> all the maps in the job. It is incremented every time a record is read
> from a RecordReader and passed to the map's map method by the framework.
> [Hadoop: The Definitive Guide, page 226]
>
> Please let us know if you find anything further.
>
> Regards.
>
>
>
> Sent from my iPhone
>
> On 2013-4-6, at 0:01, Philippe Signoret <philippe.signoret@gmail.com> wrote:
>
> I noticed recently that some Word Count jobs I've run are finishing with
> the MAP_INPUT_BYTES counter missing.
>
> I'm using Hadoop 1.1.2 with mostly default configuration with 5 nodes. The
> input was a single 100KB text file.
>
> Questions:
>
>    - Is it normal for any final counter values not to be present?
>    - Is MAP_INPUT_BYTES the best way to determine total input data size?
>    (I do so programmatically, while it's running and after the job is
>    complete.)
>
> The counters I did get:
>
> Job Counters
>  TOTAL_LAUNCHED_REDUCES: 1
>  SLOTS_MILLIS_MAPS: 6006
>  FALLOW_SLOTS_MILLIS_REDUCES: 0
>  FALLOW_SLOTS_MILLIS_MAPS: 0
>  TOTAL_LAUNCHED_MAPS: 1
>  DATA_LOCAL_MAPS: 1
>  SLOTS_MILLIS_REDUCES: 9293
> File Output Format Counters
>  BYTES_WRITTEN: 366752
> FileSystemCounters
>  FILE_BYTES_READ: 505552
>  HDFS_BYTES_READ: 1085517
>  FILE_BYTES_WRITTEN: 1122685
>  HDFS_BYTES_WRITTEN: 366752
> File Input Format Counters
>  BYTES_READ: 1085357
> Map-Reduce Framework
>  MAP_OUTPUT_MATERIALIZED_BYTES: 505552
>  MAP_INPUT_RECORDS: 19446
>  REDUCE_SHUFFLE_BYTES: 505552
>  SPILLED_RECORDS: 70358
>  MAP_OUTPUT_BYTES: 1750111
>  CPU_MILLISECONDS: 5700
>  COMMITTED_HEAP_BYTES: 401997824
>  COMBINE_INPUT_RECORDS: 181151
>  SPLIT_RAW_BYTES: 160
>  REDUCE_INPUT_RECORDS: 35179
>  REDUCE_INPUT_GROUPS: 35179
>  COMBINE_OUTPUT_RECORDS: 35179
>  PHYSICAL_MEMORY_BYTES: 378482688
>  REDUCE_OUTPUT_RECORDS: 35179
>  VIRTUAL_MEMORY_BYTES: 1139838976
>  MAP_OUTPUT_RECORDS: 181151
>
>
> Here are most of the relevant screens from the JobTracker web interface:
> http://jsfiddle.net/Fguyy/2/embedded/result/
>
> Here is the JobTracker log (relevant time frame):
> http://pastebin.com/dvsMn4fB
>
> Thanks!
> Philippe
>
> -------------------------------
> *Philippe Signoret*
> Skype: philippesignoret
> +33 6 95 89 55 55
>
>
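
For anyone reading the input size programmatically, here is a rough Python sketch of one way to cope when MAP_INPUT_BYTES is absent. It parses a subset of the counter dump quoted above and falls back to BYTES_READ from "File Input Format Counters". That fallback is an assumption on my part, not something confirmed in this thread, and the function names (parse_counters, input_bytes) are made up for illustration. It also checks one thing that can be read off the posted numbers: for this particular job, HDFS_BYTES_READ minus SPLIT_RAW_BYTES equals BYTES_READ.

```python
import re

# Counter values copied from the job output quoted in this thread (subset).
COUNTER_DUMP = """\
FileSystemCounters
 FILE_BYTES_READ: 505552
 HDFS_BYTES_READ: 1085517
File Input Format Counters
 BYTES_READ: 1085357
Map-Reduce Framework
 MAP_INPUT_RECORDS: 19446
 SPLIT_RAW_BYTES: 160
"""


def parse_counters(text):
    """Parse 'Group' / ' NAME: value' lines into {group: {name: int}}."""
    groups = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        # Counter lines are indented; group headings are not.
        match = re.match(r"^\s+(\w+):\s*(\d+)\s*$", line)
        if match and current is not None:
            groups[current][match.group(1)] = int(match.group(2))
        else:
            current = line.strip()
            groups.setdefault(current, {})
    return groups


def input_bytes(groups):
    """Return (counter_name, value), preferring MAP_INPUT_BYTES when present."""
    framework = groups.get("Map-Reduce Framework", {})
    if "MAP_INPUT_BYTES" in framework:
        return "MAP_INPUT_BYTES", framework["MAP_INPUT_BYTES"]
    # Assumed fallback (not confirmed in the thread): the input format's BYTES_READ.
    file_input = groups.get("File Input Format Counters", {})
    if "BYTES_READ" in file_input:
        return "BYTES_READ", file_input["BYTES_READ"]
    return None, None


counters = parse_counters(COUNTER_DUMP)
name, value = input_bytes(counters)
print(name, value)

# Cross-check against the posted numbers: HDFS_BYTES_READ minus SPLIT_RAW_BYTES.
hdfs_read = counters["FileSystemCounters"]["HDFS_BYTES_READ"]
split_raw = counters["Map-Reduce Framework"]["SPLIT_RAW_BYTES"]
print(hdfs_read - split_raw == value)
```

Whether that identity holds for other jobs or input formats is not something the thread establishes, so treat the cross-check as a sanity test for this one run and not as a rule.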
