hadoop-common-dev mailing list archives

From "Amar Kamat (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3441) Pass the size of the MapReduce input to JobInProgress
Date Mon, 26 May 2008 04:23:55 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12599748#action_12599748 ]

Amar Kamat commented on HADOOP-3441:
------------------------------------

Shouldn't _input-size_ be part of job conf?

> Pass the size of the MapReduce input to JobInProgress
> -----------------------------------------------------
>
>                 Key: HADOOP-3441
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3441
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.17.0
>         Environment: all
>            Reporter: Ari Rabkin
>            Assignee: Ari Rabkin
>            Priority: Minor
>             Fix For: 0.18.0
>
>         Attachments: addDataSize.patch
>
>
> Currently, there's no easy way for the JobInProgress to know how large the job's input data is.
> This patch corrects the problem by storing the size of each input split's data through the RawSplit. The split sizes are then totaled and made available via JobInProgress.getInputSize().
> This is needed, among other reasons, so that the JobInProgress knows how much data it's being run on, which will help build smarter schedulers.
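
The approach described in the quoted issue (record a size per split, then total the sizes for the job) can be sketched roughly as follows. This is only an illustration, not the contents of addDataSize.patch: SplitInfo and JobSummary are hypothetical stand-ins for RawSplit and JobInProgress, and only the getInputSize() name comes from the issue description.

```java
import java.util.Arrays;
import java.util.List;

public class InputSizeSketch {

    /** Hypothetical stand-in for a split record that carries its data length in bytes. */
    static final class SplitInfo {
        private final long dataLength;

        SplitInfo(long dataLength) {
            this.dataLength = dataLength;
        }

        long getDataLength() {
            return dataLength;
        }
    }

    /** Hypothetical stand-in for the job-side object that exposes the total input size. */
    static final class JobSummary {
        private final long inputSize;

        JobSummary(List<SplitInfo> splits) {
            // Total the per-split sizes once, when the job is set up.
            long total = 0L;
            for (SplitInfo split : splits) {
                total += split.getDataLength();
            }
            this.inputSize = total;
        }

        /** Total input size in bytes, summed over all splits. */
        long getInputSize() {
            return inputSize;
        }
    }

    public static void main(String[] args) {
        // Three made-up split sizes: 64 MB, 64 MB and 10 MB.
        List<SplitInfo> splits = Arrays.asList(
                new SplitInfo(64L << 20),
                new SplitInfo(64L << 20),
                new SplitInfo(10L << 20));

        JobSummary job = new JobSummary(splits);
        System.out.println("Total input size in bytes: " + job.getInputSize());
    }
}
```

A scheduler could then read the total from the job object rather than re-deriving it from the splits, which seems to be the motivation given in the issue.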

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

