hadoop-mapreduce-issues mailing list archives

From "lbkzman (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MAPREDUCE-6245) Fixed split shuffling.
Date Thu, 05 Feb 2015 14:47:34 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

lbkzman updated MAPREDUCE-6245:
-------------------------------
    Status: Patch Available  (was: Open)

index 72b47f2..8b89782 100644
--- src/java/org/apache/hadoop/mapreduce/lib/partition/InputSampler.java
+++ src/java/org/apache/hadoop/mapreduce/lib/partition/InputSampler.java
@@ -203,12 +203,8 @@ public class InputSampler<K,V> extends Configured implements Tool  {
       r.setSeed(seed);
       LOG.debug("seed: " + seed);
       // shuffle splits
-      for (int i = 0; i < splits.size(); ++i) {
-        InputSplit tmp = splits.get(i);
-        int j = r.nextInt(splits.size());
-        splits.set(i, splits.get(j));
-        splits.set(j, tmp);
-      }
+      Collections.shuffle(splits);      
+
       // our target rate is in terms of the maximum number of sample splits,
       // but we accept the possibility of sampling additional splits to hit
       // the target sample keyset
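
The message does not say why the hand-written loop was replaced. One well-known property of that loop shape (swapping every position i with an index drawn from the whole list) is that it does not produce all orderings with equal probability, whereas the JDK's Collections.shuffle does. Note also that the one-argument Collections.shuffle(List) uses a default source of randomness, so the seeded Random r configured just above the changed lines is not consulted; the two-argument overload Collections.shuffle(List, Random) is the JDK call that accepts one. The following is a minimal, self-contained sketch of that difference, not the committed Hadoop code. The class name SplitShuffleSketch, the helper methods, and the use of String in place of InputSplit are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical demo class; not part of Hadoop.
public class SplitShuffleSketch {

    // Same shape as the loop removed by the patch: each position i is
    // swapped with an index j drawn from the WHOLE list, which does not
    // yield a uniform distribution over permutations.
    static <T> void naiveShuffle(List<T> items, Random r) {
        for (int i = 0; i < items.size(); ++i) {
            T tmp = items.get(i);
            int j = r.nextInt(items.size());
            items.set(i, items.get(j));
            items.set(j, tmp);
        }
    }

    // Shuffles a copy with the two-argument JDK overload, so the order
    // depends only on the seed and is reproducible across runs.
    static <T> List<T> seededShuffle(List<T> items, long seed) {
        List<T> copy = new ArrayList<>(items);
        Collections.shuffle(copy, new Random(seed));
        return copy;
    }

    public static void main(String[] args) {
        // Strings stand in for InputSplit objects (assumption for the demo).
        List<String> splits = Arrays.asList("s0", "s1", "s2", "s3", "s4");

        // Same seed, same order: useful when a sampler seed is configured.
        System.out.println(seededShuffle(splits, 42L));
        System.out.println(seededShuffle(splits, 42L));

        // One-argument overload: order may differ from run to run.
        List<String> unseeded = new ArrayList<>(splits);
        Collections.shuffle(unseeded);
        System.out.println(unseeded);
    }
}
```

If reproducible sampling from a fixed seed is wanted, passing the existing Random to the two-argument overload would preserve it; whether that is the intent of this patch is not stated in the message.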


> Fixed split shuffling.
> ----------------------
>
>                 Key: MAPREDUCE-6245
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6245
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 2.6.0
>            Reporter: lbkzman
>            Assignee: lbkzman
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
