flink-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: UnknownHostException during start
Date Thu, 11 May 2017 14:37:26 GMT
Dominique:
Which Hadoop release are you using?

Please pastebin the classpath.

Cheers

On Thu, May 11, 2017 at 7:27 AM, Till Rohrmann <trohrmann@apache.org> wrote:

> Hi Dominique,
>
> I’m not exactly sure but this looks more like a Hadoop or a Hadoop
> configuration problem to me. Could it be that the Hadoop version you’re
> running does not support the specification of multiple KMS servers via
> kms://https@lfrarXXX1.srv.company;lfrarXXX2.srv.company:16000/kms?
>
> Cheers,
> Till
>
> On Thu, May 11, 2017 at 4:06 PM, Dominique Rondé <
> dominique.ronde@allsecur.de> wrote:
>
>> Dear all,
>>
>> I ran into trouble starting Flink in a YARN container on a Cloudera
>> cluster. My start script looks like this:
>>
>> slaxxxx:/applvg/home/flink/mvp $ cat run.sh
>> export FLINK_HOME_DIR=/applvg/home/flink/mvp/flink-1.2.0/
>> export FLINK_JAR_DIR=/applvg/home/flink/mvp/cache
>> export YARN_CONF_DIR=/etc/hadoop/conf
>> export HADOOP_CONF_DIR=/etc/hadoop/conf
>>
>>
>> /applvg/home/flink/mvp/flink-1.2.0/bin/yarn-session.sh -n 4 -s 3 -st -jm 2048 -tm 2048 -qu root.mr-spark.avp -d
>>
>> When I execute this script, I get the following output:
>>
>> sla09037:/applvg/home/flink/mvp $ ./run.sh
>> 2017-05-11 15:13:24,541 INFO
>> org.apache.flink.configuration.GlobalConfiguration            - Loading
>> configuration property: jobmanager.rpc.address, localhost
>> 2017-05-11 15:13:24,542 INFO
>> org.apache.flink.configuration.GlobalConfiguration            - Loading
>> configuration property: jobmanager.rpc.port, 6123
>> 2017-05-11 15:13:24,542 INFO
>> org.apache.flink.configuration.GlobalConfiguration            - Loading
>> configuration property: jobmanager.heap.mb, 256
>> 2017-05-11 15:13:24,543 INFO
>> org.apache.flink.configuration.GlobalConfiguration            - Loading
>> configuration property: taskmanager.heap.mb, 512
>> 2017-05-11 15:13:24,543 INFO
>> org.apache.flink.configuration.GlobalConfiguration            - Loading
>> configuration property: taskmanager.numberOfTaskSlots, 1
>> 2017-05-11 15:13:24,543 INFO
>> org.apache.flink.configuration.GlobalConfiguration            - Loading
>> configuration property: taskmanager.memory.preallocate, false
>> 2017-05-11 15:13:24,543 INFO
>> org.apache.flink.configuration.GlobalConfiguration            - Loading
>> configuration property: parallelism.default, 1
>> 2017-05-11 15:13:24,543 INFO
>> org.apache.flink.configuration.GlobalConfiguration            - Loading
>> configuration property: jobmanager.web.port, 8081
>> 2017-05-11 15:13:24,571 INFO
>> org.apache.flink.configuration.GlobalConfiguration            - Loading
>> configuration property: jobmanager.rpc.address, localhost
>> 2017-05-11 15:13:24,572 INFO
>> org.apache.flink.configuration.GlobalConfiguration            - Loading
>> configuration property: jobmanager.rpc.port, 6123
>> 2017-05-11 15:13:24,572 INFO
>> org.apache.flink.configuration.GlobalConfiguration            - Loading
>> configuration property: jobmanager.heap.mb, 256
>> 2017-05-11 15:13:24,572 INFO
>> org.apache.flink.configuration.GlobalConfiguration            - Loading
>> configuration property: taskmanager.heap.mb, 512
>> 2017-05-11 15:13:24,572 INFO
>> org.apache.flink.configuration.GlobalConfiguration            - Loading
>> configuration property: taskmanager.numberOfTaskSlots, 1
>> 2017-05-11 15:13:24,572 INFO
>> org.apache.flink.configuration.GlobalConfiguration            - Loading
>> configuration property: taskmanager.memory.preallocate, false
>> 2017-05-11 15:13:24,572 INFO
>> org.apache.flink.configuration.GlobalConfiguration            - Loading
>> configuration property: parallelism.default, 1
>> 2017-05-11 15:13:24,572 INFO
>> org.apache.flink.configuration.GlobalConfiguration            - Loading
>> configuration property: jobmanager.web.port, 8081
>> 2017-05-11 15:13:25,000 INFO
>> org.apache.flink.runtime.security.modules.HadoopModule        - Hadoop
>> user set to flink@COMPANYDE.ROOTDOM.NET (auth:KERBEROS)
>> 2017-05-11 15:13:25,030 INFO
>> org.apache.flink.configuration.GlobalConfiguration            - Loading
>> configuration property: jobmanager.rpc.address, localhost
>> 2017-05-11 15:13:25,030 INFO
>> org.apache.flink.configuration.GlobalConfiguration            - Loading
>> configuration property: jobmanager.rpc.port, 6123
>> 2017-05-11 15:13:25,030 INFO
>> org.apache.flink.configuration.GlobalConfiguration            - Loading
>> configuration property: jobmanager.heap.mb, 256
>> 2017-05-11 15:13:25,030 INFO
>> org.apache.flink.configuration.GlobalConfiguration            - Loading
>> configuration property: taskmanager.heap.mb, 512
>> 2017-05-11 15:13:25,031 INFO
>> org.apache.flink.configuration.GlobalConfiguration            - Loading
>> configuration property: taskmanager.numberOfTaskSlots, 1
>> 2017-05-11 15:13:25,031 INFO
>> org.apache.flink.configuration.GlobalConfiguration            - Loading
>> configuration property: taskmanager.memory.preallocate, false
>> 2017-05-11 15:13:25,031 INFO
>> org.apache.flink.configuration.GlobalConfiguration            - Loading
>> configuration property: parallelism.default, 1
>> 2017-05-11 15:13:25,031 INFO
>> org.apache.flink.configuration.GlobalConfiguration            - Loading
>> configuration property: jobmanager.web.port, 8081
>> 2017-05-11 15:13:25,050 INFO
>> org.apache.flink.yarn.YarnClusterDescriptor                   - Using
>> values:
>> 2017-05-11 15:13:25,051 INFO
>> org.apache.flink.yarn.YarnClusterDescriptor                   -
>> TaskManager count = 4
>> 2017-05-11 15:13:25,051 INFO
>> org.apache.flink.yarn.YarnClusterDescriptor                   -
>> JobManager memory = 2048
>> 2017-05-11 15:13:25,051 INFO
>> org.apache.flink.yarn.YarnClusterDescriptor                   -
>> TaskManager memory = 2048
>> 2017-05-11 15:13:25,903 WARN
>> org.apache.hadoop.util.NativeCodeLoader                       - Unable
>> to load native-hadoop library for your platform... using builtin-java
>> classes where applicable
>> 2017-05-11 15:13:25,962 WARN
>> org.apache.flink.yarn.YarnClusterDescriptor                   - The
>> configuration directory ('/applvg/home/flink/mvp/flink-1.2.0/conf')
>> contains both LOG4J and Logback configuration files. Please delete or
>> rename one of them.
>> 2017-05-11 15:13:25,972 INFO
>> org.apache.flink.yarn.Utils                                   - Copying
>> from file:/applvg/home/flink/mvp/flink-1.2.0/lib to
>> hdfs://nameservice1/user/flink/.flink/application_1493762518335_0216/lib
>> 2017-05-11 15:13:27,522 INFO
>> org.apache.flink.yarn.Utils                                   - Copying
>> from file:/applvg/home/flink/mvp/flink-1.2.0/conf/log4j.properties to
>> hdfs://nameservice1/user/flink/.flink/application_1493762518335_0216/log4j.properties
>> 2017-05-11 15:13:27,552 INFO
>> org.apache.flink.yarn.Utils                                   - Copying
>> from file:/applvg/home/flink/mvp/flink-1.2.0/conf/logback.xml to
>> hdfs://nameservice1/user/flink/.flink/application_1493762518335_0216/logback.xml
>> 2017-05-11 15:13:27,584 INFO
>> org.apache.flink.yarn.Utils                                   - Copying
>> from
>> file:/applvg/home/flink/mvp/flink-1.2.0/lib/flink-dist_2.11-1.2.0.jar to
>> hdfs://nameservice1/user/flink/.flink/application_1493762518335_0216/flink-dist_2.11-1.2.0.jar
>> 2017-05-11 15:13:28,508 INFO
>> org.apache.flink.yarn.Utils                                   - Copying
>> from /applvg/home/flink/mvp/flink-1.2.0/conf/flink-conf.yaml to
>> hdfs://nameservice1/user/flink/.flink/application_1493762518335_0216/flink-conf.yaml
>> 2017-05-11 15:13:28,553 INFO
>> org.apache.flink.yarn.YarnClusterDescriptor                   - Adding
>> delegation token to the AM container..
>> 2017-05-11 15:13:28,563 INFO
>> org.apache.hadoop.hdfs.DFSClient                              - Created
>> HDFS_DELEGATION_TOKEN token 27247 for flink on ha-hdfs:nameservice1
>> Error while deploying YARN cluster: Couldn't deploy Yarn cluster
>> java.lang.RuntimeException: Couldn't deploy Yarn cluster
>>         at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploy(AbstractYarnClusterDescriptor.java:421)
>>         at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:620)
>>         at org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:476)
>>         at org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:473)
>>         at org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:422)
>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
>>         at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
>>         at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:473)
>> Caused by: java.lang.IllegalArgumentException: java.net.UnknownHostException: lfrar256.srv.company;lfrar257.srv.company
>>         at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:374)
>>         at org.apache.hadoop.crypto.key.kms.KMSClientProvider.getDelegationTokenService(KMSClientProvider.java:823)
>>         at org.apache.hadoop.crypto.key.kms.KMSClientProvider.addDelegationTokens(KMSClientProvider.java:779)
>>         at org.apache.hadoop.crypto.key.KeyProviderDelegationTokenExtension.addDelegationTokens(KeyProviderDelegationTokenExtension.java:86)
>>         at org.apache.hadoop.hdfs.DistributedFileSystem.addDelegationTokens(DistributedFileSystem.java:2046)
>>         at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:121)
>>         at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:100)
>>         at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:80)
>>         at org.apache.flink.yarn.Utils.setTokensFor(Utils.java:154)
>>         at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deployInternal(AbstractYarnClusterDescriptor.java:753)
>>         at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploy(AbstractYarnClusterDescriptor.java:419)
>>         ... 9 more
>> Caused by: java.net.UnknownHostException: lfrarXXX1.srv.company;lfrarXXX2.srv.company
>>         ... 20 more
>>
>> It seems that flink found these hosts here:
>> slaxxxxx:/applvg/home/flink/mvp $ grep -r "lfrarXXX1.srv.company;lfrarXXX2.srv.company" /etc/hadoop/conf
>> /etc/hadoop/conf/core-site.xml: <value>kms://https@lfrarXXX1.srv.company;lfrarXXX2.srv.company:16000/kms</value>
>> /etc/hadoop/conf/hdfs-site.xml: <value>kms://https@lfrarXXX1.srv.company;lfrarXXX2.srv.company:16000/kms</value>
>>
>> So I guess that Flink takes this connection string from the Cloudera
>> configuration and "forgets" to split it at the ";". If I ping each of
>> the two hosts individually, both respond.
>>
>> Do you have any hints on how to avoid this problem?
>>
>> Best wishes
>> Dominique
>>
>>
>
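
To make the splitting discussed above concrete, here is a minimal shell sketch of how the host part of such a multi-host KMS URI could be split at the ";" into separate host:port pairs. It only illustrates the idea: `kms_hosts` is a hypothetical helper name, not a Hadoop or Flink function, and it makes no claim about how the Hadoop KMS client handles such URIs internally.

```shell
# Hypothetical helper (not part of Hadoop or Flink): print one host:port
# line per host in a URI of the form kms://<scheme>@<h1>;<h2>:<port>/<path>.
kms_hosts() {
  local uri authority hosts port h
  uri="$1"
  authority="${uri#kms://*@}"   # drop the "kms://https@" prefix
  authority="${authority%%/*}"  # drop the "/kms" path
  hosts="${authority%:*}"       # part before the last ":" is the host list
  port="${authority##*:}"       # part after the last ":" is the shared port
  local IFS=';'                 # split the host list at ";"
  for h in $hosts; do
    printf '%s:%s\n' "$h" "$port"
  done
}

kms_hosts 'kms://https@lfrarXXX1.srv.company;lfrarXXX2.srv.company:16000/kms'
```

Each name printed this way can then be checked on its own (for example with ping, as in the original report), whereas the unsplit string "lfrarXXX1.srv.company;lfrarXXX2.srv.company" is not a resolvable host name.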
