mahout-user mailing list archives

From "Joshi, Amit Krishna" <joshi...@wright.edu>
Subject RE: Increase timeout for running PFPGrowth
Date Mon, 22 Oct 2012 23:38:42 GMT
Thanks Matt! That didn't help; I still get the same timeout error. I even tried running
it via hadoop directly (instead of using bin/mahout and setting MAHOUT_OPTS):

$ hadoop jar mahout-examples-0.6-job.jar org.apache.mahout.fpm.pfpgrowth.FPGrowthDriver \
    -Dmapred.task.timeout=18000000 -Dmapred.child.java.opts=-Xmx4000m \
    -i in_* -o out_fp -method mapreduce -regex '[,\t]' -s 10000 -g 2000

(I have tried both: with and without a space after -D)

Two map/reduce jobs run in sequence. The first one runs without problems; the
second one dies at around 20% reduce every time:

INFO mapred.JobClient:  map 100% reduce 18%
12/10/22 19:04:27 INFO mapred.JobClient:  map 100% reduce 19%
12/10/22 19:14:35 INFO mapred.JobClient:  map 100% reduce 0%
12/10/22 19:14:40 INFO mapred.JobClient: Task Id : attempt_201210140938_0108_r_000000_0, Status : FAILED
Task attempt_201210140938_0108_r_000000_0 failed to report status for 601 seconds. Killing!
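
A quick sanity check on the numbers in the log above, as a rough sketch rather than a diagnosis: mapred.task.timeout is given in milliseconds, so 18000000 would be 5 hours, while the task was killed after 601 seconds, which is about what the stock Hadoop 1.x default of 600000 ms would produce. The variable names below are made up for illustration, and the default value is an assumption about this cluster's configuration.

```shell
#!/usr/bin/env bash
# Illustrative arithmetic only; all variable names are hypothetical.
requested_ms=18000000   # value passed via -Dmapred.task.timeout
observed_s=601          # "failed to report status for 601 seconds" in the log
default_ms=600000       # assumed: stock Hadoop 1.x default (10 minutes)

requested_s=$((requested_ms / 1000))
default_s=$((default_ms / 1000))

echo "requested timeout: ${requested_s}s ($((requested_s / 3600))h)"
echo "default timeout:   ${default_s}s"

# A task killed long before the requested timeout suggests the override
# never reached the job configuration used by the reducer.
if [ "$observed_s" -lt "$requested_s" ]; then
  verdict="override not applied"
else
  verdict="override applied"
fi
echo "$verdict"
```

If that reading is right, the open question is why the -D value is not being picked up by the second job, not whether the value itself is large enough.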

-
Amit
________________________________________
From: Matt Molek [mpmolek@gmail.com]
Sent: Monday, October 22, 2012 5:12 PM
To: user@mahout.apache.org
Subject: Re: Increase timeout for running PFPGrowth

Isn't this the same question you asked earlier today?

I responded to the initial one that "-D mapred.task.timeout=18000000"
shouldn't have a space after the D. It should be
"-Dmapred.task.timeout=18000000"

And IIRC, these Hadoop parameters need to go before all of your other
parameters.

On Mon, Oct 22, 2012 at 4:54 PM, Joshi, Amit Krishna
<joshi.35@wright.edu> wrote:
> Hi,
>
> I am running PFPGrowth on several datasets and it works well for the smaller ones (< 5GB).
> However, for the larger ones, I keep getting the following timeout message.
>
> Task attempt_201210140938_0105_r_000000_0 failed to report status for 600 seconds. Killing!
>
> Is there a way I can increase the timeout?
>
> I even tried passing these parameters, but in vain:
> -D mapred.task.timeout=18000000 -D mapred.child.java.opts=-Xmx4000m
>
> My input params are:  -s 10000 -g 1000  -tc 8  -k 50 -method mapreduce
>
> Also, please suggest what the optimum values of g and k would be.
> Number of features: more than a million
>
>
> Thanks,
> Amit

