hadoop-mapreduce-issues mailing list archives

From "Zhijie Shen (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MAPREDUCE-5225) SplitSampler in mapreduce.lib should use a SPLIT_STEP to jump around splits
Date Thu, 09 May 2013 04:17:16 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-5225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhijie Shen updated MAPREDUCE-5225:
-----------------------------------

    Attachment: MAPREDUCE-5225.1.patch

With this patch, SplitSampler advances through the splits by a fixed sampling step instead of
reading only the leading splits. As a result, SplitSampler behaves identically in mapred and
mapreduce.
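
For illustration only, here is a minimal Java sketch of the difference between taking the
first N splits and stepping across all of them. It is not the content of
MAPREDUCE-5225.1.patch: the class SplitStepSketch and the methods firstN and stepped are
hypothetical names, the step is assumed to be the number of splits divided by the number of
splits to sample, and plain int indices stand in for Hadoop InputSplit objects. Only
maxSplitsSampled is a name taken from the issue text.

```java
import java.util.Arrays;

// Hypothetical illustration only; this is not the code from MAPREDUCE-5225.1.patch.
final class SplitStepSketch {

    // Indices visited when only the leading splits are sampled.
    static int[] firstN(int numSplits, int maxSplitsSampled) {
        int splitsToSample = Math.max(0, Math.min(maxSplitsSampled, numSplits));
        int[] picked = new int[splitsToSample];
        for (int i = 0; i < splitsToSample; i++) {
            picked[i] = i;
        }
        return picked;
    }

    // Indices visited when jumping by a fixed step (assumed: numSplits / splitsToSample).
    static int[] stepped(int numSplits, int maxSplitsSampled) {
        int splitsToSample = Math.min(maxSplitsSampled, numSplits);
        if (splitsToSample <= 0) {
            return new int[0]; // nothing to sample; also avoids division by zero
        }
        int splitStep = numSplits / splitsToSample; // always >= 1 at this point
        int[] picked = new int[splitsToSample];
        for (int i = 0; i < splitsToSample; i++) {
            picked[i] = i * splitStep; // largest index stays below numSplits
        }
        return picked;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(firstN(10, 3)));  // prints [0, 1, 2]
        System.out.println(Arrays.toString(stepped(10, 3))); // prints [0, 3, 6]
    }
}
```

With 10 splits and maxSplitsSampled = 3, the first-N approach never looks past split 2, while
the stepped approach spreads its samples over the range of splits. This matters when the input
is ordered or clustered, since the leading splits may not be representative.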
                
> SplitSampler in mapreduce.lib should use a SPLIT_STEP to jump around splits
> ---------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5225
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5225
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Zhijie Shen
>            Assignee: Zhijie Shen
>         Attachments: MAPREDUCE-5225.1.patch
>
>
> Currently, SplitSampler samples only the first maxSplitsSampled splits, a behavior introduced by MAPREDUCE-1820.
> However, jumping around all splits is generally preferable to sampling only the first N splits.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
