hadoop-mapreduce-issues mailing list archives

From "javaloveme (JIRA)" <j...@apache.org>
Subject [jira] [Created] (MAPREDUCE-6827) Failed to traverse Iterable values the second time in reduce() method
Date Fri, 30 Dec 2016 11:57:58 GMT
javaloveme created MAPREDUCE-6827:
-------------------------------------

             Summary: Failed to traverse Iterable values the second time in reduce() method
                 Key: MAPREDUCE-6827
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6827
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: task
    Affects Versions: 3.0.0-alpha1
         Environment: hadoop2.7.3
            Reporter: javaloveme


Failed to traverse Iterable values the second time in reduce() method

The following code is a reduce() method (of WordCount):

	public static class WcReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

		@Override
		protected void reduce(Text key, Iterable<IntWritable> values, Context context)
				throws IOException, InterruptedException {

			// print some logs
			List<String> vals = new LinkedList<>();
			for(IntWritable i : values) {
				vals.add(i.toString());
			}
			System.out.println(String.format(">>>> reduce(%s, [%s])",
					key, String.join(", ", vals)));

			// sum of values
			int sum = 0;
			for(IntWritable i : values) {
				sum += i.get();
			}
			System.out.println(String.format(">>>> reduced(%s, %s)",
					key, sum));
			
			context.write(key, new IntWritable(sum));
		}			
	}

After running the job, every sum in the output was zero.

Debugging showed that the second for-each loop never executed. The root cause is the
value returned by Iterable.iterator(): both for-each loops received the same iterator
instance, so it was already exhausted when the second loop started. In general,
Iterable.iterator() should return a new instance on each call, as ArrayList.iterator() does.
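
The behavior described above can be reproduced without Hadoop. The following is a minimal sketch, not Hadoop's implementation: SinglePassIterable is a hypothetical stand-in that hands out the same iterator on every call, and SinglePassDemo, sumInSecondLoop and sumInSinglePass are illustrative names. It also shows one possible workaround, assuming the logging and the summing can share a single loop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class SinglePassDemo {

    /** Hypothetical stand-in: returns the same iterator instance on every call. */
    static final class SinglePassIterable<T> implements Iterable<T> {
        private final Iterator<T> shared;

        SinglePassIterable(List<T> source) {
            this.shared = source.iterator();
        }

        @Override
        public Iterator<T> iterator() {
            return shared; // a second for-each loop receives an exhausted iterator
        }
    }

    /** Mirrors the reported pattern: one loop to collect, a second loop to sum. */
    static int sumInSecondLoop(Iterable<Integer> values) {
        List<String> seen = new ArrayList<>();
        for (Integer v : values) {
            seen.add(v.toString());
        }
        int sum = 0;
        for (Integer v : values) { // not entered if the iterator is shared
            sum += v;
        }
        return sum;
    }

    /** Possible workaround: collect the log strings and the sum in one pass. */
    static int sumInSinglePass(Iterable<Integer> values, List<String> seen) {
        int sum = 0;
        for (Integer v : values) {
            seen.add(v.toString());
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> ones = Arrays.asList(1, 1, 1);

        // Shared iterator: the second loop is skipped, so the sum stays 0.
        System.out.println(sumInSecondLoop(new SinglePassIterable<>(ones)));

        // Single pass over the same kind of Iterable: the sum is 3.
        List<String> seen = new ArrayList<>();
        System.out.println(sumInSinglePass(new SinglePassIterable<>(ones), seen));

        // An ordinary List returns a fresh iterator each time, so two loops work.
        System.out.println(sumInSecondLoop(ones));
    }
}
```

Applied to the reducer in this report, the same idea would mean building the vals list and accumulating sum inside one for-each loop over values.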





