Hi Sergey, thanks for your answer and sorry for the delay.

I'm using Hadoop 2.4.0 and Giraph 1.1. I believe this is the relevant code in that version of Giraph:
https://github.com/apache/giraph/blob/release-1.1/giraph-core/src/main/java/org/apache/giraph/job/JobProgressTrackerService.java#L136

I'm using these job parameters:
-w 4 -yh 5700 -ca giraph.metrics.enable=true,giraph.useOutOfCoreMessages=true,giraph.isStaticGraph=true,giraph.maxAllowedJobTimeMilliseconds=10000

I'm using a cluster of 1 master and 4 slaves in AWS. 

I have attached logs from three different containers. One superstep took 12 seconds (more than the 10000 ms limit I set), yet the Giraph application was not stopped.
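
Until the Giraph option works, one fallback would be to enforce the limit from the launcher script itself. The following is only a minimal sketch under assumptions: `run_with_timeout` is a hypothetical helper name, and the demo commands are placeholders rather than the real Giraph launch command. It also only stops the local client process; the YARN application would presumably still need to be stopped separately, for example with `yarn application -kill <application id>`.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return its exit code, or None if it ran longer than timeout_s.

    On timeout, subprocess.run kills the child process before raising
    TimeoutExpired, so the caller can move on to the next job.
    Note: this only kills the local process, not the YARN application.
    """
    try:
        return subprocess.run(cmd, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        return None


if __name__ == "__main__":
    # Placeholder commands standing in for the real Giraph jobs.
    jobs = [
        [sys.executable, "-c", "pass"],
        [sys.executable, "-c", "import time; time.sleep(5)"],
    ]
    for job in jobs:
        status = run_with_timeout(job, timeout_s=1)
        print("timed out" if status is None else "exit code %d" % status)
```

With this approach the script moves on to the next job as soon as the limit is hit, instead of waiting for the whole application to fail on its own.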

Thanks in advance!

-- 
José Luis Larroque
Analista Programador Universitario - Facultad de Informática - UNLP
Desarrollador Java  en LIFIA

2017-01-25 1:33 GMT-03:00 Sergey Edunov <edunov@gmail.com>:
Hello José,

giraph.maxAllowedJobTimeMilliseconds is supposed to do exactly what
you want, see the code here:
https://github.com/apache/giraph/blob/trunk/giraph-core/src/main/java/org/apache/giraph/job/DefaultJobProgressTrackerService.java#L123

However, I have never tested it with any Hadoop distro other than
Hadoop 1.0, so maybe it doesn't work in your environment.

Can you share the exact configuration (job parameters and Hadoop version)
and the messages you see in the log?

Regards,
Sergey Edunov


On Tue, Jan 24, 2017 at 7:26 PM, José Luis Larroque
<larroquester@gmail.com> wrote:
> I have to run several Giraph processes in AWS. To do that, I have a
> script that launches one process after another until all of them have
> finished. The problem is that sometimes a container gets killed, and I spend
> a lot of time waiting for the entire Giraph app to be killed so that the next
> one can start. I'm trying to reduce this time, because I know that a process
> that takes more than 5 minutes isn't going to finish (I would rather have a
> few Giraph processes killed if that significantly reduces the total time for
> running all of them).
>
> I already tried setting a "maximum amount of time" with the following
> options, using a really low value (1 millisecond):
>
> giraph.waitTaskDoneTimeoutMs -> This option makes the container throw an
> IllegalStateException but doesn't stop the Giraph app from running. I know
> that this option has a reported bug, but I hope that is not the case here.
> giraph.maxAllowedJobTimeMilliseconds -> With the log level set to DEBUG, I
> couldn't see any impact from using this option.
>
> So far I'm not getting the expected result, and I have Giraph applications
> that take around 12000 seconds or more (a big waste of time, resources and
> money).
>
>
> Any help will be greatly appreciated.
>
>
> Thanks!
>
>
> --
> José Luis Larroque
> Analista Programador Universitario - Facultad de Informática - UNLP
> Desarrollador Java  en LIFIA