Hi,
I have used canopy and k-means clustering to cluster around 1.2 million
instances. The CSV file size is around 425 MB. However, when I run the
"mahout clusterdump" command as shown below, I get a Java OutOfMemoryError.
mahout clusterdump -dt sequencefile -i clean-kmeans-clusters/clusters-1-final/part-r-00000 -n 20 -b 100 -o cdump-clean.txt -p clean-kmeans-clusters/clusteredPoints/
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at org.apache.mahout.math.DenseVector.<init>(DenseVector.java:44)
    at org.apache.mahout.math.DenseVector.<init>(DenseVector.java:39)
    at org.apache.mahout.math.VectorWritable.readFields(VectorWritable.java:99)
    at org.apache.mahout.clustering.classify.WeightedVectorWritable.readFields(WeightedVectorWritable.java:56)
I have switched to 64-bit Ubuntu and even tried setting 4 GB, 8 GB, and
12 GB of memory for Java:
JAVA_HEAP_MAX=-Xmx4g
JAVA_HEAP_MAX=-Xmx8g
JAVA_HEAP_MAX=-Xmx12g
I am not sure how to increase the memory available to the Java runtime.
How can I check whether the Java installation on Ubuntu is 64-bit?
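
One way to check both points is a small standalone Java program. The
following is only a minimal sketch: the class name JvmInfo is made up for
illustration, and the "sun.arch.data.model" property is specific to
HotSpot-style JVMs, so the code falls back to the standard "os.arch"
property when it is missing.

```java
public class JvmInfo {

    // Returns "64" or "32" when the JVM reports it, otherwise "unknown".
    // Assumption: "sun.arch.data.model" is only set by HotSpot-style JVMs.
    public static String dataModel() {
        String model = System.getProperty("sun.arch.data.model");
        if (model != null && !model.isEmpty()) {
            return model;
        }
        // Fallback: architecture names such as amd64 or aarch64 contain "64".
        String arch = System.getProperty("os.arch", "");
        return arch.contains("64") ? "64" : "unknown";
    }

    // Maximum heap this JVM will try to use, in MiB (reflects -Xmx).
    public static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("JVM data model: " + dataModel());
        System.out.println("Max heap (MiB): " + maxHeapMiB());
    }
}
```

After compiling with "javac JvmInfo.java", running "java -Xmx8g JvmInfo"
should report a maximum heap close to 8192 MiB on a 64-bit JVM. This only
shows what the JVM itself accepts; whether the mahout script actually passes
the JAVA_HEAP_MAX value through to the JVM has to be checked separately.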
Thanks
Rajesh