flink-user mailing list archives

From Dominique Rondé <dominique.ro...@allsecur.de>
Subject Re: Problems submitting Flink to Yarn with Kerberos
Date Tue, 30 May 2017 14:41:05 GMT
Hi Gordon,

We use Flink 1.2.0, bundled with Hadoop 2.6 and Scala 2.11, built on
2017-02-02.

Cheers

Dominique


On 30.05.2017 at 16:31, Tzu-Li (Gordon) Tai wrote:
> Hi Dominique,
>
> Could you tell us the version / build commit of Flink that you’re using?
>
> Cheers,
> Gordon
>
>
> On 30 May 2017 at 4:29:08 PM, Dominique Rondé
> (dominique.ronde@allsecur.de) wrote:
>
>> Hi folks,
>>
>> I now need to deploy Flink on a YARN cluster that is secured with
>> Kerberos. Following the documentation, I changed flink-conf.yaml as
>> follows:
>>
>> security.kerberos.login.use-ticket-cache: true
>> security.kerberos.login.contexts: Client
>>
>> I know that providing a keytab is the preferred approach, but I have to
>> file a special request to receive one. ;-)
>>
>> After startup, provisioning stops with this error:
>>
>> 2017-05-30 16:16:48,684 INFO 
>> org.apache.flink.yarn.YarnClusterClient                       -
>> Waiting until all TaskManagers have connected
>> Waiting until all TaskManagers have connected
>> 2017-05-30 16:16:48,685 INFO 
>> org.apache.flink.yarn.YarnClusterClient                       -
>> Starting client actor system.
>> 2017-05-30 16:16:52,099 WARN 
>> org.apache.flink.runtime.net.ConnectionUtils                  - Could
>> not connect to lfrar255.srv.allianz/10.17.24.162:56659. Selecting a
>> local address using heuristics.
>> 2017-05-30 16:16:52,473 INFO 
>> akka.event.slf4j.Slf4jLogger                                  -
>> Slf4jLogger started
>> 2017-05-30 16:16:52,512 INFO 
>> Remoting                                                      -
>> Starting remoting
>> 2017-05-30 16:16:52,670 INFO 
>> Remoting                                                      -
>> Remoting started; listening on addresses
>> :[akka.tcp://flink@sla09037.srv.allianz:34579]
>> Exception in thread "main" java.lang.RuntimeException: Unable to get
>> ClusterClient status from Application Client
>>         at
>> org.apache.flink.yarn.YarnClusterClient.getClusterStatus(YarnClusterClient.java:248)
>>         at
>> org.apache.flink.yarn.YarnClusterClient.waitForClusterToBeReady(YarnClusterClient.java:520)
>>         at
>> org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:660)
>>         at
>> org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:476)
>>         at
>> org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:473)
>>         at
>> org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:422)
>>         at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
>>         at
>> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
>>         at
>> org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:473)
>> Caused by:
>> org.apache.flink.runtime.leaderretrieval.LeaderRetrievalException:
>> Could not retrieve the leader gateway
>>         at
>> org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:141)
>>         at
>> org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:691)
>>         at
>> org.apache.flink.yarn.YarnClusterClient.getClusterStatus(YarnClusterClient.java:242)
>>         ... 10 more
>> Caused by: java.util.concurrent.TimeoutException: Futures timed out
>> after [10000 milliseconds]
>>         at
>> scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
>>         at
>> scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
>>         at
>> scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
>>         at
>> scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
>>         at scala.concurrent.Await$.result(package.scala:190)
>>         at scala.concurrent.Await.result(package.scala)
>>         at
>> org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:139)
>>         ... 12 more
>> 2017-05-30 16:17:02,690 INFO 
>> org.apache.flink.yarn.YarnClusterClient                       -
>> Shutting down YarnClusterClient from the client shutdown hook
>> 2017-05-30 16:17:02,691 INFO 
>> org.apache.flink.yarn.YarnClusterClient                       -
>> Disconnecting YarnClusterClient from ApplicationMaster
>> 2017-05-30 16:17:03,693 INFO 
>> akka.remote.RemoteActorRefProvider$RemotingTerminator         -
>> Shutting down remote daemon.
>> 2017-05-30 16:17:03,696 INFO 
>> akka.remote.RemoteActorRefProvider$RemotingTerminator         -
>> Remote daemon shut down; proceeding with flushing remote transports.
>> 2017-05-30 16:17:03,744 INFO 
>> akka.remote.RemoteActorRefProvider$RemotingTerminator         -
>> Remoting shut down.
>>  
>> Does anyone have an idea what is going wrong?
>>
>> Best wishes
>>
>> Dominique
>>

