hadoop-user mailing list archives

From Jason Yang <lin.yang.ja...@gmail.com>
Subject How to run multiple jobs at the same time?
Date Sun, 23 Sep 2012 16:31:19 GMT
Hi all,

I have implemented a K-Means algorithm in MapReduce. The program runs many
iterations, and each iteration is a separate MapReduce job. Here is my
pseudo-code:

-----
int count  = 0;
do
{
    ....
    SET input path = output path of last iteration;
    SET output path = new path(count);
    ...
    runJob
}
while( (!converged) && (count < maxCount) )
------

My question is: what should I do if I want to run this algorithm on
multiple data sets at the same time?

Because each iteration depends on the output of the previous one, I have to
use JobClient.runJob(), which blocks until the iteration has finished.

Could I use threads, one per data set?
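
A minimal sketch of the thread-based idea, under stated assumptions: each
data set gets its own driver thread, and each thread runs the same blocking
do/while loop as the pseudo-code. The names ConcurrentKMeansDriver,
IterationRunner, runKMeans, runAll and the "/input" and "/iter-N" path
layout are hypothetical, invented for illustration. runIteration() is only
a stand-in for building a fresh JobConf and calling JobClient.runJob(conf);
no Hadoop classes are used here, so the sketch runs on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ConcurrentKMeansDriver {

    /** Hypothetical stand-in for one blocking MapReduce iteration. */
    interface IterationRunner {
        // Returns true once the clustering has converged.
        // A real version would build a JobConf and call JobClient.runJob(conf).
        boolean runIteration(String inputPath, String outputPath) throws Exception;
    }

    /** The do/while loop from the pseudo-code, for a single data set. */
    static int runKMeans(String dataSet, int maxCount, IterationRunner runner)
            throws Exception {
        int count = 0;
        boolean converged;
        String input = dataSet + "/input";               // assumed path layout
        do {
            String output = dataSet + "/iter-" + count;  // new output path per iteration
            converged = runner.runIteration(input, output); // blocks until done
            input = output;                              // next input = last output
            count++;
        } while (!converged && count < maxCount);
        return count;                                    // iterations actually run
    }

    /** Runs one blocking driver loop per data set, each in its own thread. */
    static List<Integer> runAll(List<String> dataSets, int maxCount,
                                IterationRunner runner) {
        ExecutorService pool =
                Executors.newFixedThreadPool(Math.max(1, dataSets.size()));
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (String dataSet : dataSets) {
                Callable<Integer> task = () -> runKMeans(dataSet, maxCount, runner);
                futures.add(pool.submit(task));
            }
            List<Integer> iterationCounts = new ArrayList<>();
            for (Future<Integer> future : futures) {
                iterationCounts.add(future.get());       // wait for this data set
            }
            return iterationCounts;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a driver thread failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Fake runner: pretends every data set converges on its third iteration.
        IterationRunner fake = (in, out) -> out.endsWith("iter-2");
        System.out.println(runAll(Arrays.asList("dataA", "dataB"), 10, fake));
    }
}
```

In a real driver, each thread would presumably need its own JobConf and its
own output directories so the jobs do not overwrite each other. How much the
jobs actually overlap on the cluster depends on the configured scheduler and
the free task slots, so threads alone may not guarantee parallel execution.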

BTW, I'm using hadoop-0.20.2
-- 
YANG, Lin
