hama-user mailing list archives

From Chia-Hung Lin <cli...@googlemail.com>
Subject Re: KMeansBSP number of BSP tasks
Date Mon, 28 Jul 2014 12:10:50 GMT
I am not sure I understand your question correctly.

The number of BSP tasks can be configured with BSPJob.setNumBspTask(tasks).

The error message "Cannot create ... already exists as a directory"
looks like an HDFS issue indicating that the destination path already
exists. You probably need to check whether the destination path is
already in HDFS and whether it is a directory.
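
As a rough illustration of that check, here is a minimal sketch that uses
the local filesystem (java.nio.file) as a stand-in for HDFS. The class name
DestPathCheck and the method classify are made up for this example and are
not part of Hama or Hadoop. On a real cluster the same question would be
asked through Hadoop's org.apache.hadoop.fs.FileSystem API (for example
FileSystem.exists) or with the hadoop fs shell; this sketch only shows the
logic.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class DestPathCheck {
    /**
     * Classifies a destination path as "missing", "directory", or "file".
     * Hypothetical helper: local-filesystem analogue of the HDFS check.
     */
    static String classify(Path dest) {
        if (!Files.exists(dest)) {
            return "missing";
        }
        return Files.isDirectory(dest) ? "directory" : "file";
    }

    public static void main(String[] args) throws IOException {
        // Create a throwaway directory and a file inside it to exercise all cases.
        Path dir = Files.createTempDirectory("kmeans-out");
        Path file = Files.createTempFile(dir, "part-", ".seq");

        System.out.println(dir + " -> " + classify(dir));
        System.out.println(file + " -> " + classify(file));
        System.out.println(dir.resolve("not-there") + " -> "
                + classify(dir.resolve("not-there")));
    }
}
```

If the path passed to the job comes back as "directory" where the job
expects to create a file, that would be consistent with the error above.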

On 28 July 2014 16:31, Giannis Giannakopoulos <giannisgiannak@gmail.com> wrote:
> Hello everyone,
> I am trying to run the kmeans clustering algorithm from the hama
> examples, but I face some problems. Specifically, I want to change the
> number of BSP tasks launched, something that is not possible through
> this
> <http://grepcode.com/file/repo1.maven.org/maven2/org.apache.hama/hama-examples/0.6.2/org/apache/hama/examples/Kmeans.java>
> , right? (meaning that the number of tasks is determined by the number
> of blocks of the input file).
> To this end, I tried to use the KMeansBSP
> <http://grepcode.com/file/repo1.maven.org/maven2/org.apache.hama/hama-ml/0.6.4/org/apache/hama/ml/kmeans/KMeansBSP.java#KMeansBSP.main%28java.lang.String[]%29>
> job, which exposes the number of launched tasks as a parameter, but I
> can't make it work :$. Specifically, I tried both text and sequence file
> input formats but the job always fails with the message
> "Cannot create <name of input>; already exists as a directory"
> When I specify a non-existent dir, I get the same message.
> Can someone please guide me through this? I want to run KMeans and I
> want to set the number of BSP tasks to launch (even if this means
> partitioning the input file -- I haven't found anything about this
> online regarding KMeans).
> Thank you in advance,
> Giannis
