flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From 范昂 <fan...@adxdata.com>
Subject unsubscribe
Date Wed, 02 Dec 2015 17:43:26 GMT


发自我的 iPhone

> 在 2015年12月3日,上午1:41,Maximilian Michels <mxm@apache.org> 写道:
> 
> Great. Here is the commit to try out:
> https://github.com/mxm/flink/commit/f49b9635bec703541f19cb8c615f302a07ea88b3
> 
> If you already have the Flink repository, check it out using
> 
> git fetch https://github.com/mxm/flink/
> f49b9635bec703541f19cb8c615f302a07ea88b3 && git checkout FETCH_HEAD
> 
> Alternatively, here's a direct download link to the sources with the
> fix included:
> https://github.com/mxm/flink/archive/f49b9635bec703541f19cb8c615f302a07ea88b3.zip
> 
> Thanks a lot,
> Max
> 
>> On Wed, Dec 2, 2015 at 5:44 PM, Niels Basjes <Niels@basjes.nl> wrote:
>> Sure, just give me the git repo url to build and I'll give it a try.
>> 
>> Niels
>> 
>>> On Wed, Dec 2, 2015 at 4:28 PM, Maximilian Michels <mxm@apache.org> wrote:
>>> 
>>> I mentioned that the exception gets thrown when requesting container
>>> status information. We need this to send a heartbeat to YARN but it is
>>> not very crucial if this fails once for the running job. Possibly, we
>>> could work around this problem by retrying N times in case of an
>>> exception.
>>> 
>>> Would it be possible for you to deploy a custom Flink 0.10.1 version
>>> we provide and test again?
>>> 
>>>> On Wed, Dec 2, 2015 at 4:03 PM, Niels Basjes <Niels@basjes.nl> wrote:
>>>> No, I was just asking.
>>>> No upgrade is possible for the next month or two.
>>>> 
>>>> This week is our busiest day of the year ...
>>>> Our shop is doing about 10 orders per second these days ...
>>>> 
>>>> So they won't upgrade until next January/February
>>>> 
>>>> Niels
>>>> 
>>>> On Wed, Dec 2, 2015 at 3:59 PM, Maximilian Michels <mxm@apache.org>
>>>> wrote:
>>>>> 
>>>>> Hi Niels,
>>>>> 
>>>>> You mentioned you have the option to update Hadoop and redeploy the
>>>>> job. Would be great if you could do that and let us know how it turns
>>>>> out.
>>>>> 
>>>>> Cheers,
>>>>> Max
>>>>> 
>>>>>> On Wed, Dec 2, 2015 at 3:45 PM, Niels Basjes <Niels@basjes.nl>
wrote:
>>>>>> Hi,
>>>>>> 
>>>>>> I posted the entire log from the first log line at the moment of
>>>>>> failure
>>>>>> to
>>>>>> the very end of the logfile.
>>>>>> This is all I have.
>>>>>> 
>>>>>> As far as I understand the Kerberos and Keytab mechanism in Hadoop
>>>>>> Yarn
>>>>>> is
>>>>>> that it catches the "Invalid Token" and then (if keytab) gets a new
>>>>>> Kerberos
>>>>>> ticket (or tgt?).
>>>>>> When the new ticket has been obtained it retries the call that
>>>>>> previously
>>>>>> failed.
>>>>>> To me it seemed that this call can fail over the invalid Token yet
it
>>>>>> cannot
>>>>>> be retried.
>>>>>> 
>>>>>> At this moment I'm thinking a bug in Hadoop.
>>>>>> 
>>>>>> Niels
>>>>>> 
>>>>>> On Wed, Dec 2, 2015 at 2:51 PM, Maximilian Michels <mxm@apache.org>
>>>>>> wrote:
>>>>>>> 
>>>>>>> Hi Niels,
>>>>>>> 
>>>>>>> Sorry for hear you experienced this exception. From a first glance,
>>>>>>> it
>>>>>>> looks like a bug in Hadoop to me.
>>>>>>> 
>>>>>>>> "Not retrying because the invoked method is not idempotent,
and
>>>>>>>> unable
>>>>>>>> to determine whether it was invoked"
>>>>>>> 
>>>>>>> That is nothing to worry about. This is Hadoop's internal retry
>>>>>>> mechanism that re-attempts to do actions which previously failed
if
>>>>>>> that's possible. Since the action is not idempotent (it cannot
be
>>>>>>> executed again without risking to change the state of the execution)
>>>>>>> and it also doesn't track its execution states, it won't be retried
>>>>>>> again.
>>>>>>> 
>>>>>>> The main issue is this exception:
>>>>>>> 
>>>>>>>> org.apache.hadoop.security.token.SecretManager$InvalidToken:
>>>>>>>> Invalid
>>>>>>>> AMRMToken from >appattempt_1443166961758_163901_000001
>>>>>>> 
>>>>>>> From the stack trace it is clear that this exception occurs upon
>>>>>>> requesting container status information from the Resource Manager:
>>>>>>> 
>>>>>>>> at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> org.apache.flink.yarn.YarnJobManager$$anonfun$handleYarnMessage$1.applyOrElse(YarnJobManager.scala:259)
>>>>>>> 
>>>>>>> Are there any more exceptions in the log? Do you have the complete
>>>>>>> logs available and could you share them?
>>>>>>> 
>>>>>>> 
>>>>>>> Best regards,
>>>>>>> Max
>>>>>>> 
>>>>>>> On Wed, Dec 2, 2015 at 11:47 AM, Niels Basjes <Niels@basjes.nl>
>>>>>>> wrote:
>>>>>>>> Hi,
>>>>>>>> 
>>>>>>>> 
>>>>>>>> We have a Kerberos secured Yarn cluster here and I'm experimenting
>>>>>>>> with
>>>>>>>> Apache Flink on top of that.
>>>>>>>> 
>>>>>>>> A few days ago I started a very simple Flink application
(just
>>>>>>>> stream
>>>>>>>> the
>>>>>>>> time as a String into HBase 10 times per second).
>>>>>>>> 
>>>>>>>> I (deliberately) asked our IT-ops guys to make my account
have a
>>>>>>>> max
>>>>>>>> ticket
>>>>>>>> time of 5 minutes and a max renew time of 10 minutes (yes,
>>>>>>>> ridiculously
>>>>>>>> low
>>>>>>>> timeout values because I needed to validate this
>>>>>>>> https://issues.apache.org/jira/browse/FLINK-2977).
>>>>>>>> 
>>>>>>>> This job is started with a keytab file and after running
for 31
>>>>>>>> hours
>>>>>>>> it
>>>>>>>> suddenly failed with the exception you see below.
>>>>>>>> 
>>>>>>>> I had the same job running for almost 400 hours until that
failed
>>>>>>>> too
>>>>>>>> (I
>>>>>>>> was
>>>>>>>> too late to check the logfiles but I suspect the same problem).
>>>>>>>> 
>>>>>>>> 
>>>>>>>> So in that time span my tickets have expired and new tickets
have
>>>>>>>> been
>>>>>>>> obtained several hundred times.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> The main error I see is that in the process of a ticket expiring
>>>>>>>> and
>>>>>>>> being
>>>>>>>> renewed I see this message:
>>>>>>>> 
>>>>>>>>     Not retrying because the invoked method is not idempotent,
>>>>>>>> and
>>>>>>>> unable
>>>>>>>> to determine whether it was invoked
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Yarn on the cluster is 2.6.0 ( HDP 2.6.0.2.2.4.2-2 )
>>>>>>>> 
>>>>>>>> Flink is version 0.10.1
>>>>>>>> 
>>>>>>>> 
>>>>>>>> How do I fix this?
>>>>>>>> Is this a bug (in either Hadoop or Flink) or am I doing something
>>>>>>>> wrong?
>>>>>>>> Would upgrading Yarn to 2.7.1  (i.e. HDP 2.3) fix this?
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Niels Basjes
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 21:30:27,821 WARN  org.apache.hadoop.security.UserGroupInformation
>>>>>>>> - PriviledgedActionException as:nbasjes (auth:SIMPLE)
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>>>>>>>> Invalid AMRMToken from appattempt_1443166961758_163901_000001
>>>>>>>> 21:30:27,861 WARN  org.apache.hadoop.ipc.Client
>>>>>>>> - Exception encountered while connecting to the server :
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>>>>>>>> Invalid AMRMToken from appattempt_1443166961758_163901_000001
>>>>>>>> 21:30:27,861 WARN  org.apache.hadoop.security.UserGroupInformation
>>>>>>>> - PriviledgedActionException as:nbasjes (auth:SIMPLE)
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>>>>>>>> Invalid AMRMToken from appattempt_1443166961758_163901_000001
>>>>>>>> 21:30:27,891 WARN
>>>>>>>> org.apache.hadoop.io.retry.RetryInvocationHandler
>>>>>>>> - Exception while invoking class
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate.
>>>>>>>> Not retrying because the invoked method is not idempotent,
and
>>>>>>>> unable
>>>>>>>> to
>>>>>>>> determine whether it was invoked
>>>>>>>> org.apache.hadoop.security.token.SecretManager$InvalidToken:
>>>>>>>> Invalid
>>>>>>>> AMRMToken from appattempt_1443166961758_163901_000001
>>>>>>>>      at
>>>>>>>> sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
>>>>>>>> Method)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>>>>>>>>      at
>>>>>>>> java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:104)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:79)
>>>>>>>>      at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown
>>>>>>>> Source)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>>>>>      at java.lang.reflect.Method.invoke(Method.java:606)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>>>>>>>>      at com.sun.proxy.$Proxy14.allocate(Unknown Source)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(AMRMClientImpl.java:245)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> org.apache.flink.yarn.YarnJobManager$$anonfun$handleYarnMessage$1.applyOrElse(YarnJobManager.scala:259)
>>>>>>>>      at
>>>>>>>> scala.PartialFunction$OrElse.apply(PartialFunction.scala:162)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:44)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28)
>>>>>>>>      at
>>>>>>>> scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28)
>>>>>>>>      at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:100)
>>>>>>>>      at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
>>>>>>>>      at akka.actor.ActorCell.invoke(ActorCell.scala:487)
>>>>>>>>      at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
>>>>>>>>      at akka.dispatch.Mailbox.run(Mailbox.scala:221)
>>>>>>>>      at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>>>>>>>> Caused by:
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>>>>>>>> Invalid AMRMToken from appattempt_1443166961758_163901_000001
>>>>>>>>      at org.apache.hadoop.ipc.Client.call(Client.java:1406)
>>>>>>>>      at org.apache.hadoop.ipc.Client.call(Client.java:1359)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>>>>>>>>      at com.sun.proxy.$Proxy13.allocate(Unknown Source)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:77)
>>>>>>>>      ... 29 more
>>>>>>>> 21:30:27,943 ERROR akka.actor.OneForOneStrategy
>>>>>>>> - Invalid AMRMToken from appattempt_1443166961758_163901_000001
>>>>>>>> org.apache.hadoop.security.token.SecretManager$InvalidToken:
>>>>>>>> Invalid
>>>>>>>> AMRMToken from appattempt_1443166961758_163901_000001
>>>>>>>>      at
>>>>>>>> sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
>>>>>>>> Method)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>>>>>>>>      at
>>>>>>>> java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:104)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:79)
>>>>>>>>      at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown
>>>>>>>> Source)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>>>>>      at java.lang.reflect.Method.invoke(Method.java:606)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>>>>>>>>      at com.sun.proxy.$Proxy14.allocate(Unknown Source)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(AMRMClientImpl.java:245)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> org.apache.flink.yarn.YarnJobManager$$anonfun$handleYarnMessage$1.applyOrElse(YarnJobManager.scala:259)
>>>>>>>>      at
>>>>>>>> scala.PartialFunction$OrElse.apply(PartialFunction.scala:162)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:44)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28)
>>>>>>>>      at
>>>>>>>> scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28)
>>>>>>>>      at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:100)
>>>>>>>>      at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
>>>>>>>>      at akka.actor.ActorCell.invoke(ActorCell.scala:487)
>>>>>>>>      at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
>>>>>>>>      at akka.dispatch.Mailbox.run(Mailbox.scala:221)
>>>>>>>>      at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>>>>>>>> Caused by:
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>>>>>>>> Invalid AMRMToken from appattempt_1443166961758_163901_000001
>>>>>>>>      at org.apache.hadoop.ipc.Client.call(Client.java:1406)
>>>>>>>>      at org.apache.hadoop.ipc.Client.call(Client.java:1359)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>>>>>>>>      at com.sun.proxy.$Proxy13.allocate(Unknown Source)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:77)
>>>>>>>>      ... 29 more
>>>>>>>> 21:30:28,075 INFO  org.apache.flink.yarn.YarnJobManager
>>>>>>>> - Stopping JobManager
>>>>>>>> akka.tcp://flink@10.10.200.3:39527/user/jobmanager.
>>>>>>>> 21:30:28,088 INFO
>>>>>>>> org.apache.flink.runtime.executiongraph.ExecutionGraph
>>>>>>>> - Source: Custom Source -> Sink: Unnamed (1/1)
>>>>>>>> (db0d95c11c14505827e696eec7efab94) switched from RUNNING
to
>>>>>>>> CANCELING
>>>>>>>> 21:30:28,113 INFO
>>>>>>>> org.apache.flink.runtime.executiongraph.ExecutionGraph
>>>>>>>> - Source: Custom Source -> Sink: Unnamed (1/1)
>>>>>>>> (db0d95c11c14505827e696eec7efab94) switched from CANCELING
to
>>>>>>>> FAILED
>>>>>>>> 21:30:28,184 INFO  org.apache.flink.runtime.blob.BlobServer
>>>>>>>> - Stopped BLOB server at 0.0.0.0:41281
>>>>>>>> 21:30:28,185 ERROR org.apache.flink.runtime.jobmanager.JobManager
>>>>>>>> - Actor akka://flink/user/jobmanager#403236912 terminated,
>>>>>>>> stopping
>>>>>>>> process...
>>>>>>>> 21:30:28,286 INFO
>>>>>>>> org.apache.flink.runtime.webmonitor.WebRuntimeMonitor
>>>>>>>> - Removing web root dir
>>>>>>>> /tmp/flink-web-e1a44f94-ea6d-40ee-b87c-e3122d5cb9bd
>>>>>>>> 
>>>>>>>> 
>>>>>>>> --
>>>>>>>> Best regards / Met vriendelijke groeten,
>>>>>>>> 
>>>>>>>> Niels Basjes
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> Best regards / Met vriendelijke groeten,
>>>>>> 
>>>>>> Niels Basjes
>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Best regards / Met vriendelijke groeten,
>>>> 
>>>> Niels Basjes
>> 
>> 
>> 
>> 
>> --
>> Best regards / Met vriendelijke groeten,
>> 
>> Niels Basjes
Mime
View raw message