hadoop-common-user mailing list archives

From Fengyun RAO <raofeng...@gmail.com>
Subject Re: YarnException: Unauthorized request to start container. This token is expired.
Date Wed, 02 Apr 2014 13:55:19 GMT
Thank you, Omkar,

I'm new to Hadoop and all the settings are at their defaults, so I guess
the container expiry interval is 10 minutes.
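
As a rough check of that guess against the two failures in the log quoted
below, here is a minimal Python sketch. It only does arithmetic on the
"current time" and "found" values from the exception messages. The 10-minute
interval is my assumption, not a confirmed setting, and the helper names
(analyze, to_utc) are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the default expiry interval is 10 minutes (a guess, not verified).
ASSUMED_EXPIRY_INTERVAL_MS = 10 * 60 * 1000

# (current time, token expiry "found") in epoch milliseconds, copied from the log.
FAILURES = {
    "container_1394434496930_0032_01_000041": (1395558481146, 1395558443384),
    "container_1394434496930_0032_01_000038": (1395558575889, 1395558443245),
}

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_utc(ms):
    """Convert epoch milliseconds to an aware UTC datetime (integer arithmetic)."""
    return EPOCH + timedelta(milliseconds=ms)


def analyze(current_ms, expiry_ms, interval_ms=ASSUMED_EXPIRY_INTERVAL_MS):
    """Return how late the launch was and when the token would have been created."""
    return {
        "late_by_ms": current_ms - expiry_ms,
        "implied_issue_utc": to_utc(expiry_ms - interval_ms),
    }


if __name__ == "__main__":
    for container_id, (current_ms, expiry_ms) in FAILURES.items():
        result = analyze(current_ms, expiry_ms)
        print(container_id,
              "late by", result["late_by_ms"], "ms;",
              "token created around", result["implied_issue_utc"].isoformat())
```

The launches were late by 37762 ms and 132644 ms, far more than the clock
differences of under 1 second between the servers. Under the 10-minute
assumption both tokens would have been created at about 06:57:23 UTC, which
is within seconds of the job start logged at 14:57:17 if the client clock is
GMT+08:00.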

The exception happens when running a big job that occupies all the
resources of all nodes.

When running a small job, with many containers still free, no exception is
thrown.


Actually, I didn't quite follow what "reservation" means.
I guess you mean the RM creates the token at the time of reservation, but
by the time it assigns the container to the AM, the token has expired.
Is this correct?
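
If that reading is right, the behaviour could be modeled roughly as follows.
This is a minimal Python sketch of my understanding, not YARN code: the names
(ContainerToken, issue_token, can_start) are hypothetical, the expiry check
is simplified, and the 10-minute interval is again an assumption.

```python
from dataclasses import dataclass

# Assumption: tokens are valid for 10 minutes after they are created.
ASSUMED_EXPIRY_INTERVAL_MS = 10 * 60 * 1000


@dataclass(frozen=True)
class ContainerToken:
    """Hypothetical stand-in for a container token with an absolute expiry time."""
    container_id: str
    expiry_ms: int


def issue_token(container_id, now_ms, interval_ms=ASSUMED_EXPIRY_INTERVAL_MS):
    """Create a token whose expiry is fixed at the moment of creation."""
    return ContainerToken(container_id, now_ms + interval_ms)


def can_start(token, now_ms):
    """Simplified node-side check: reject the launch once the token has expired."""
    return now_ms <= token.expiry_ms


if __name__ == "__main__":
    reserved_at = 0                 # container reserved on a busy cluster
    assigned_at = 11 * 60 * 1000    # resources free up and the AM gets it 11 min later

    # Token created at reservation time: already expired when the AM launches.
    early = issue_token("container_A", reserved_at)
    print("token from reservation:", can_start(early, assigned_at))

    # Token created at assignment time: still valid at launch.
    late = issue_token("container_A", assigned_at)
    print("token from assignment:", can_start(late, assigned_at + 1000))
```

In this model a container that stays reserved for longer than the interval
can never be started with its original token, which would fit the failures
appearing only when a big job occupies the whole cluster.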

Could you do me a favor and help me find the JIRA, or tell me which
version fixed the problem?

Thanks!

2014-03-30 0:33 GMT+08:00 omkar joshi <omkar.vinit.joshi.86@gmail.com>:

> Can you check a few things?
> What is the container expiry interval set to?
> How many containers are getting allocated?
> Is there any reservation of containers happening?
> If yes, then that was a known problem. I don't remember the JIRA number,
> though. The underlying problem with reservation was that it creates the
> token at the time of reservation and not when it issues the token to the AM.
>
>
>
> On Fri, Mar 28, 2014 at 6:03 AM, Leibnitz <se3g2011@gmail.com> wrote:
>
>> no doubt
>>
>> > On Mar 23, 2014, at 17:37, Fengyun RAO <raofengyun@gmail.com> wrote:
>> >
>> > What does this exception mean? I googled a lot, and all the results
>> > say it happens because the clocks are not synchronized between the
>> > datanode and the namenode.
>> > However, I checked all the servers: the ntpd service is running and
>> > the time differences are less than 1 second.
>> > What's more, the tasks do not always fail on particular datanodes.
>> > A task fails, then it restarts and succeeds. If it were a clock
>> > problem, I guess it would always fail.
>> >
>> > My Hadoop version is CDH5 beta. Below is the detailed log:
>> >
>> > 14/03/23 14:57:06 INFO mapreduce.Job: Running job: job_1394434496930_0032
>> > 14/03/23 14:57:17 INFO mapreduce.Job: Job job_1394434496930_0032 running in uber mode : false
>> > 14/03/23 14:57:17 INFO mapreduce.Job:  map 0% reduce 0%
>> > 14/03/23 15:08:01 INFO mapreduce.Job: Task Id : attempt_1394434496930_0032_m_000034_0, Status : FAILED
>> > Container launch failed for container_1394434496930_0032_01_000041 : org.apache.hadoop.yarn.exceptions.YarnException: Unauthorized request to start container.
>> > This token is expired. current time is 1395558481146 found 1395558443384
>> >        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>> >        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>> >        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>> >        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>> >        at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateException(SerializedExceptionPBImpl.java:152)
>> >        at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.deSerialize(SerializedExceptionPBImpl.java:106)
>> >        at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:155)
>> >        at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:370)
>> >        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> >        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> >        at java.lang.Thread.run(Thread.java:724)
>> >
>> > 14/03/23 15:08:02 INFO mapreduce.Job:  map 1% reduce 0%
>> > 14/03/23 15:09:36 INFO mapreduce.Job: Task Id : attempt_1394434496930_0032_m_000036_0, Status : FAILED
>> > Container launch failed for container_1394434496930_0032_01_000038 : org.apache.hadoop.yarn.exceptions.YarnException: Unauthorized request to start container.
>> > This token is expired. current time is 1395558575889 found 1395558443245
>> >        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>> >        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>> >        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>> >        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>> >        at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateException(SerializedExceptionPBImpl.java:152)
>> >        at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.deSerialize(SerializedExceptionPBImpl.java:106)
>> >        at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:155)
>> >        at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:370)
>> >        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> >        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> >        at java.lang.Thread.run(Thread.java:724)
>> >
>>
>
>
