abdera-dev mailing list archives

From "liutengfei (JIRA)" <j...@apache.org>
Subject [jira] [Created] (ABDERA-303) CIReducer in kmeans doesn't work well
Date Wed, 26 Sep 2012 09:36:07 GMT
liutengfei created ABDERA-303:

             Summary: CIReducer in kmeans doesn't work well
                 Key: ABDERA-303
                 URL: https://issues.apache.org/jira/browse/ABDERA-303
             Project: Abdera
          Issue Type: Bug
         Environment: hadoop:
   hadoop-2.0.0-alpha: pseudo cluster and single node cluster
   hadoop-1.0.3: pseudo cluster
   hadoop-0.20.2: pseudo cluster
   ubuntu 11.04

            Reporter: liutengfei

The reduce function in CIReducer.java (Mahout 0.7, k-means) does not behave the way the code suggests.

  protected void reduce(IntWritable key, Iterable<ClusterWritable> values, Context context)
      throws IOException, InterruptedException {
    Iterator<ClusterWritable> iter = values.iterator();
    ClusterWritable first = null;
    while (iter.hasNext()) {
      ClusterWritable cw = iter.next();
      if (first == null) {
        first = cw;
      } else {
        // ... (body of the else branch is not shown in this message)
      }
    }
    List<Cluster> models = new ArrayList<Cluster>();
    classifier = new ClusterClassifier(models, policy);
    context.write(key, first);
  }

Apparently, the variable "first" is meant to accumulate all of the map outputs. In practice,
however, the value of "first" changes after the statement "ClusterWritable cw = iter.next();"
executes, and it ends up identical to the new variable "cw". I don't know why, but the results
show that the code behaves as if it were written "ClusterWritable cw = first = iter.next();".
Is "cw" a reference to an object owned by "iter"?
Does "iter.next()" just overwrite that same object with the next value?
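
The symptom described above is consistent with an iterator that returns the same
object on every call and overwrites its contents in place; Hadoop documents its
reduce-side value iterator as reusing the value object. The following is only a
minimal sketch of that effect in plain Java, with no Hadoop or Mahout dependency.
Box, ReusingIterator, and ReuseDemo are hypothetical names invented for this
illustration, and Box merely stands in for a mutable value such as ClusterWritable.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical mutable value, standing in for something like ClusterWritable. */
class Box {
  int value;
  Box(int value) { this.value = value; }
}

/** Hypothetical iterator that hands out the SAME Box on every call. */
class ReusingIterator implements Iterator<Box> {
  private final Iterator<Integer> source;
  private final Box shared = new Box(0);

  ReusingIterator(List<Integer> data) { this.source = data.iterator(); }

  @Override public boolean hasNext() { return source.hasNext(); }

  @Override public Box next() {
    shared.value = source.next(); // overwrite in place, no new object is created
    return shared;
  }
}

class ReuseDemo {
  /** Same shape as the loop in the report: keeps the first reference returned. */
  static int keepReference(List<Integer> data) {
    Iterator<Box> iter = new ReusingIterator(data);
    Box first = null;
    while (iter.hasNext()) {
      Box cw = iter.next();
      if (first == null) {
        first = cw; // first now aliases the shared instance
      }
    }
    return first.value;
  }

  /** Variant that copies the first value before it can be overwritten. */
  static int keepCopy(List<Integer> data) {
    Iterator<Box> iter = new ReusingIterator(data);
    Box first = null;
    while (iter.hasNext()) {
      Box cw = iter.next();
      if (first == null) {
        first = new Box(cw.value); // independent copy
      }
    }
    return first.value;
  }

  public static void main(String[] args) {
    List<Integer> data = Arrays.asList(10, 20, 30);
    System.out.println(keepReference(data)); // prints 30: first tracked the last value
    System.out.println(keepCopy(data));      // prints 10: the copy kept the first value
  }
}
```

If this is what happens in the reducer, "first" and "cw" would refer to one object
for the whole loop, which would match the "cw = first = iter.next()" behaviour
reported above. Whether that is the actual cause in CIReducer is not confirmed here.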

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
