giraph-user mailing list archives

From José Luis Larroque <larroques...@gmail.com>
Subject Re: Stop Giraph application when reach certain amount of time
Date Fri, 27 Jan 2017 02:32:22 GMT
I believe I found something related to this issue.

The behavior when maxAllowedJobTimeMilliseconds is set depends heavily on
the giraph.trackJobProgressOnClient option, which is *false* by
default.

To stop the job when the elapsed time reaches the
maxAllowedJobTimeMilliseconds value,
the mapperStarted() method of JobProgressTrackerService has to be executed.

In Giraph 1.1, the giraph.trackJobProgressOnClient configuration option is
false by default. In that case, a JobProgressTrackerClientNoOp is
created for tracking progress on the client. This class implements the
mapperStarted() method, but with an *empty* body, so nothing is done: the
thread that should kill the job after the maximum amount of time is never
created, and that's why I'm not seeing any LOG output related to this
option in the logs.
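
To illustrate the pattern I'm describing, here is a minimal sketch. The names
ProgressTracker, NoOpTracker and WatchdogTracker are hypothetical stand-ins
that I made up for this example; they are not the real Giraph classes and do
not reproduce their signatures. The sketch only shows why an empty
mapperStarted() means that no timeout thread ever exists:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical, simplified stand-ins for illustration only.
// These are NOT the real Giraph classes or their signatures.
interface ProgressTracker {
  void mapperStarted();
}

// Analogous to a no-op client: the callback exists but does nothing,
// so no timeout thread is ever created.
class NoOpTracker implements ProgressTracker {
  @Override
  public void mapperStarted() {
    // intentionally empty
  }
}

// Analogous to a tracker that enforces a maximum job time.
class WatchdogTracker implements ProgressTracker {
  private final long maxAllowedMillis;
  private final Runnable killAction;
  private final CountDownLatch finished = new CountDownLatch(1);
  private boolean started;

  WatchdogTracker(long maxAllowedMillis, Runnable killAction) {
    this.maxAllowedMillis = maxAllowedMillis;
    this.killAction = killAction;
  }

  @Override
  public synchronized void mapperStarted() {
    if (started) {
      return; // only one watchdog per job
    }
    started = true;
    Thread watchdog = new Thread(() -> {
      try {
        // await() returns false when the timeout elapses first
        if (!finished.await(maxAllowedMillis, TimeUnit.MILLISECONDS)) {
          killAction.run();
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }, "job-timeout-watchdog");
    watchdog.setDaemon(true);
    watchdog.start();
  }

  // Call when the job ends normally so the watchdog exits quietly.
  void jobFinished() {
    finished.countDown();
  }
}
```

With NoOpTracker, calling mapperStarted() has no effect at all, which matches
what I see: no thread, no kill, and no log lines about the time limit.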

I tried to run a job with giraph.trackJobProgressOnClient set to true,
but when I did, all my containers threw this exception:
java.lang.NoClassDefFoundError: org/apache/thrift/transport/TTransport

Apparently, when I set giraph.trackJobProgressOnClient to true, a
RetryableJobProgressTrackerClient is created instead of a
JobProgressTrackerClientNoOp, and RetryableJobProgressTrackerClient uses
classes that I don't have available on my classpath, like
org/apache/thrift/transport/TTransport. Should I start downloading
dependency jars until the NoClassDefFoundError is solved, or is there a
better workaround for this problem?
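
For reference, this is a rough sketch of how the presence of a class on a
given classpath could be checked. ClasspathCheck is a hypothetical helper I
wrote for this message, not part of Giraph or Hadoop, and the result is only
meaningful if it runs with the same classpath as the containers:

```java
// Hypothetical helper for illustration; not part of Giraph or Hadoop.
class ClasspathCheck {
  // Returns true if the named class can be located by this class's loader.
  static boolean isClassAvailable(String className) {
    try {
      // initialize=false: only locate the class, do not run static initializers
      Class.forName(className, false, ClasspathCheck.class.getClassLoader());
      return true;
    } catch (ClassNotFoundException | LinkageError e) {
      return false;
    }
  }

  public static void main(String[] args) {
    // Defaults to the class named in the exception above.
    String name = args.length > 0
        ? args[0]
        : "org.apache.thrift.transport.TTransport";
    System.out.println(name + " available: " + isClassAvailable(name));
  }
}
```

The TTransport class normally ships in the Apache Thrift libthrift jar, so
that would be the first jar to look for.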

Any help will be greatly appreciated.

bye!
José




-- 
*José Luis Larroque*
Analista Programador Universitario - Facultad de Informática - UNLP
Desarrollador Java  en LIFIA

2017-01-25 22:51 GMT-03:00 José Luis Larroque <larroquester@gmail.com>:

> Sorry, I forgot to attach the log files; here they are:
>
>
> --
> *José Luis Larroque*
> Analista Programador Universitario - Facultad de Informática - UNLP
> Desarrollador Java  en LIFIA
>
> 2017-01-25 22:50 GMT-03:00 José Luis Larroque <larroquester@gmail.com>:
>
>> Hi Sergey, thanks for your answer, and sorry for the delay.
>>
>> I'm using Hadoop 2.4.0 and Giraph 1.1. For this version of Giraph, I
>> believe the relevant code is this:
>> https://github.com/apache/giraph/blob/release-1.1/giraph-core/src/main/java/org/apache/giraph/job/JobProgressTrackerService.java#L136
>>
>> I'm using these job parameters:
>> -w 4 -yh 5700 -ca giraph.metrics.enable=true,giraph.useOutOfCoreMessages=true,giraph.isStaticGraph=true,giraph.maxAllowedJobTimeMilliseconds=10000
>>
>> I'm using a cluster of 1 master and 4 slaves in AWS.
>>
>> I have attached logs from three different containers. I have a superstep
>> that took 12 seconds, and the entire Giraph application doesn't get stopped.
>>
>> Thanks in advance!
>>
>> --
>> *José Luis Larroque*
>> Analista Programador Universitario - Facultad de Informática - UNLP
>> Desarrollador Java  en LIFIA
>>
>> 2017-01-25 1:33 GMT-03:00 Sergey Edunov <edunov@gmail.com>:
>>
>>> Hello José,
>>>
>>> giraph.maxAllowedJobTimeMilliseconds is supposed to do exactly what
>>> you want, see the code here:
>>> https://github.com/apache/giraph/blob/trunk/giraph-core/src/main/java/org/apache/giraph/job/DefaultJobProgressTrackerService.java#L123
>>>
>>> However, I have never tested it with any hadoop distro other than
>>> hadoop 1.0, so maybe it doesn't work in your environment.
>>>
>>> Can you share the exact configuration (job parameters and Hadoop version)
>>> and the messages you see in the log?
>>>
>>> Regards,
>>> Sergey Edunov
>>>
>>>
>>> On Tue, Jan 24, 2017 at 7:26 PM, José Luis Larroque
>>> <larroquester@gmail.com> wrote:
>>> > I have to execute several Giraph processes in AWS. To do so, I have a
>>> > script that launches one process after another until all of them have
>>> > finished. The problem is that sometimes a container gets killed, and I
>>> > spend a lot of time waiting for the entire Giraph app to get killed so
>>> > that the next one can start. I'm trying to reduce this time, because I
>>> > know that a process that takes more than 5 minutes isn't going to
>>> > finish (I prefer having a few Giraph processes killed if the maximum
>>> > time for executing all of them gets reduced significantly).
>>> >
>>> > I already tried setting a "maximum amount of time" with the following
>>> > options, using a really low value (1 millisecond):
>>> >
>>> > giraph.waitTaskDoneTimeoutMs -> This option makes the container throw an
>>> > IllegalStateException but doesn't stop the Giraph app from running. I
>>> > know that this option has a reported bug, but I hope that is not the
>>> > case here.
>>> > giraph.maxAllowedJobTimeMilliseconds -> With the LOG level at DEBUG, I
>>> > couldn't see any impact from using this option.
>>> >
>>> > Still, I'm not getting the expected result, and I have Giraph
>>> > applications that take around 12000 seconds or more (a big waste of
>>> > time, resources and money).
>>> >
>>> >
>>> > Any help will be greatly appreciated.
>>> >
>>> >
>>> > Thanks!
>>> >
>>> >
>>> > --
>>> > José Luis Larroque
>>> > Analista Programador Universitario - Facultad de Informática - UNLP
>>> > Desarrollador Java  en LIFIA
>>>
>>
>>
>
