hadoop-mapreduce-dev mailing list archives

From "Milind Bhandarkar (JIRA)" <j...@apache.org>
Subject [jira] Created: (MAPREDUCE-1917) Semantics of map.input.bytes is not consistent
Date Tue, 06 Jul 2010 16:51:49 GMT
Semantics of map.input.bytes is not consistent
----------------------------------------------

                 Key: MAPREDUCE-1917
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1917
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
          Components: task
         Environment: All
            Reporter: Milind Bhandarkar
            Assignee: Arun C Murthy


The map.input.bytes counter is updated by the RecordReader, so its meaning depends on the
input format. For sequence files, it is the size of the raw data, which may be compressed.
For text files, it is the size of the uncompressed data. For PigStorage, it is always 0.
This request is to give the counter consistent semantics. Since HDFS_BYTES_READ already
shows the raw split size read by the mapper, MAP_INPUT_BYTES should be the size of the
uncompressed data.
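
To illustrate why the two interpretations diverge, here is a minimal sketch in plain Python.
It is not Hadoop code: the function name counter_views and the sample records are hypothetical,
and gzip merely stands in for whatever codec the input actually uses. It computes the raw
(compressed) size and the uncompressed size of the same records, which roughly correspond to
what HDFS_BYTES_READ and the proposed MAP_INPUT_BYTES would each report.

```python
import gzip


def counter_views(records):
    """Return (raw_bytes, uncompressed_bytes) for a list of byte-string records.

    raw_bytes is the size of the data as stored (gzip-compressed here), and
    uncompressed_bytes is the size of the records as the mapper would see them.
    Both names are illustrative and do not come from Hadoop.
    """
    # Each record is terminated by a newline, as in a text input file.
    payload = b"".join(record + b"\n" for record in records)
    stored = gzip.compress(payload)
    return len(stored), len(payload)


# Highly repetitive sample data, so the compressed form is much smaller.
records = [b"hadoop mapreduce counter"] * 1000
raw_bytes, uncompressed_bytes = counter_views(records)
print("raw (compressed) bytes:", raw_bytes)
print("uncompressed bytes:", uncompressed_bytes)
```

With compressed input the two numbers differ, so a counter that reports one for sequence
files and the other for text files cannot be compared across jobs.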

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

