I am using the Mahout 0.4 release. From mahout-distribution-0.4/bin, I ran:

    ./mahout canopy -i /home/space/lucene_clustering/vector/vector -o /home/space/lucene_clustering/canopy/ -dm org.apache.mahout.common.distance.EuclideanDistanceMeasure -t1 0.8 -t2 0.2 -ow

In hadoop-env.sh, I added:

    export HADOOP_HEAPSIZE=20000
    export HADOOP_OPTS="-Xmn3g -Xss128k -XX:ParallelGCThreads=20 -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:SurvivorRatio=8 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThreshold=31 -XX:+AggressiveOpts -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=9004 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false"

I am sure all the parameters take effect, because I used jconsole to check the VM parameters. However, after the job reports map 100% and reduce 100%, memory usage climbs from 2.5 GB to 20 GB and the exception below is thrown. The input vector file used for canopy is 30 MB and contains 50000 records.

10/11/23 16:04:27 INFO mapred.LocalJobRunner: reduce > reduce
10/11/23 16:04:27 INFO mapred.JobClient: map 100% reduce 100%
10/11/23 16:04:30 INFO mapred.LocalJobRunner: reduce > reduce
10/11/23 16:08:17 WARN mapred.LocalJobRunner: job_local_0001
java.lang.OutOfMemoryError: Java heap space
    at org.apache.mahout.math.map.OpenIntDoubleHashMap.rehash(OpenIntDoubleHashMap.java:434)
    at org.apache.mahout.math.map.OpenIntDoubleHashMap.put(OpenIntDoubleHashMap.java:387)
    at org.apache.mahout.math.RandomAccessSparseVector.setQuick(RandomAccessSparseVector.java:134)
    at org.apache.mahout.math.AbstractVector.assign(AbstractVector.java:449)
    at org.apache.mahout.clustering.AbstractCluster.computeParameters(AbstractCluster.java:184)
    at org.apache.mahout.clustering.canopy.CanopyReducer.reduce(CanopyReducer.java:42)
    at org.apache.mahout.clustering.canopy.CanopyReducer.reduce(CanopyReducer.java:29)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
10/11/23 16:08:18 INFO mapred.JobClient: Job complete: job_local_0001
10/11/23 16:08:18 INFO mapred.JobClient: Counters: 12
10/11/23 16:08:18 INFO mapred.JobClient: FileSystemCounters
10/11/23 16:08:18 INFO mapred.JobClient: FILE_BYTES_READ=70413991
10/11/23 16:08:18 INFO mapred.JobClient: FILE_BYTES_WRITTEN=164338288
10/11/23 16:08:18 INFO mapred.JobClient: Map-Reduce Framework
10/11/23 16:08:18 INFO mapred.JobClient: Reduce input groups=1
10/11/23 16:08:18 INFO mapred.JobClient: Combine output records=0
10/11/23 16:08:18 INFO mapred.JobClient: Map input records=50000
10/11/23 16:08:18 INFO mapred.JobClient: Reduce shuffle bytes=0
10/11/23 16:08:18 INFO mapred.JobClient: Reduce output records=227
10/11/23 16:08:18 INFO mapred.JobClient: Spilled Records=64708
10/11/23 16:08:18 INFO mapred.JobClient: Map output bytes=8836211
10/11/23 16:08:18 INFO mapred.JobClient: Combine input records=0
10/11/23 16:08:18 INFO mapred.JobClient: Map output records=32354
10/11/23 16:08:18 INFO mapred.JobClient: Reduce input records=32354
Exception in thread "main" java.lang.InterruptedException: Canopy Job failed processing /home/space/lucene_clustering/vector/vector
    at org.apache.mahout.clustering.canopy.CanopyDriver.buildClustersMR(CanopyDriver.java:252)
    at org.apache.mahout.clustering.canopy.CanopyDriver.buildClusters(CanopyDriver.java:167)
    at org.apache.mahout.clustering.canopy.CanopyDriver.run(CanopyDriver.java:114)
    at org.apache.mahout.clustering.canopy.CanopyDriver.run(CanopyDriver.java:91)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.mahout.clustering.canopy.CanopyDriver.main(CanopyDriver.java:58)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
    at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
    at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:184)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
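

For anyone who wants to double-check the heap settings without jconsole, here is a minimal sketch that prints the limits and arguments the JVM actually received. The class name HeapCheck is hypothetical and is not part of Mahout or Hadoop; it only uses standard java.lang.management calls, and it assumes it is launched with the same options as the job (for example the -Xmx value that HADOOP_HEAPSIZE is expected to produce).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.util.List;

// Hypothetical helper, not part of Mahout or Hadoop.
public class HeapCheck {

    private static final long MIB = 1024L * 1024L;

    // Maximum heap this JVM will try to use, in MiB.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / MIB;
    }

    // Heap currently in use, in MiB.
    static long usedHeapMiB() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed() / MIB;
    }

    // Arguments passed to the JVM itself (e.g. -Xmn3g, -XX:+UseConcMarkSweepGC).
    static List<String> jvmArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        System.out.println("max heap (MiB):  " + maxHeapMiB());
        System.out.println("used heap (MiB): " + usedHeapMiB());
        for (String arg : jvmArguments()) {
            System.out.println("jvm arg: " + arg);
        }
    }
}
```

If the reported maximum is close to 20000 MiB, the settings are reaching the JVM and the OutOfMemoryError reflects real heap exhaustion in the reducer rather than a misconfigured limit.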