flink-user-zh mailing list archives

From Biao Liu <mmyy1...@gmail.com>
Subject Re: flink on yarn job startup fails
Date Mon, 08 Apr 2019 08:00:51 GMT
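Hi,
"Queue's AM resource limit exceeded"
-> This looks like YARN limiting the resources that AMs are allowed to use, with a cap of 4096 MB of memory? You are presumably launching in job mode, where each job starts its own AM, and each AM takes 2048 MB of memory? If that is the math, there is indeed only room to start two.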
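
The arithmetic behind that guess can be sketched in a few lines. This is a simplified, memory-only model and not YARN's actual scheduler code. The limit, usage, and request figures come from the Diagnostics message in the quoted mail. Where the 4096 MB comes from is an assumption the thread does not confirm: if all 5 servers run a NodeManager with yarn.nodemanager.resource.memory-mb at 8192 and the Capacity Scheduler keeps yarn.scheduler.capacity.maximum-am-resource-percent at 0.1, the product happens to be 4096 MB. The function names are made up for illustration.

```python
def queue_am_limit_mb(cluster_memory_mb: int, am_resource_fraction: float) -> int:
    """Memory a queue may spend on ApplicationMasters (simplified model)."""
    return int(round(cluster_memory_mb * am_resource_fraction))


def max_concurrent_ams(am_limit_mb: int, am_request_mb: int) -> int:
    """How many AMs of one size fit under the AM limit."""
    return am_limit_mb // am_request_mb


def can_activate(am_limit_mb: int, am_usage_mb: int, am_request_mb: int) -> bool:
    """True if one more AM of the requested size still fits under the limit."""
    return am_usage_mb + am_request_mb <= am_limit_mb


# Figures reported in the Diagnostics message of the quoted mail.
AM_LIMIT_MB = 4096
AM_USAGE_MB = 4096
AM_REQUEST_MB = 2048

# Assumption, not confirmed in the thread: 5 NodeManagers with 8192 MB each
# and an AM resource fraction of 0.1.
ASSUMED_LIMIT_MB = queue_am_limit_mb(5 * 8192, 0.1)

if __name__ == "__main__":
    print("assumed AM limit (MB):", ASSUMED_LIMIT_MB)
    print("AMs that fit:", max_concurrent_ams(AM_LIMIT_MB, AM_REQUEST_MB))
    print("third AM fits:", can_activate(AM_LIMIT_MB, AM_USAGE_MB, AM_REQUEST_MB))
```

Under the same model, either raising the AM fraction for the queue or requesting a smaller JobManager container (for example 1024 MB, if the jobs can run with that) would let more per-job clusters start; which one is appropriate depends on the cluster and is not settled in this thread.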

1900 <575209351@qq.com> wrote on Thu, Apr 4, 2019 at 4:54 PM:

> We currently deploy everything as Flink on YARN with HA; Flink is the community release 1.7.2 and Hadoop is the community release 2.8.5
>
>
> The Flink cluster currently has 5 servers in total, each with a 4-core CPU and 8 GB of memory
>
>
> The basic Flink configuration is
> jobmanager.heap.size: 2048m
> taskmanager.heap.size: 2048m
> taskmanager.numberOfTaskSlots: 4
>
>
> Jobs are started the "run a job on flink" way, currently with a parallelism of one per job
> The command looks like flink run -d -m yarn-cluster  ...
>
>
> After two jobs have been submitted successfully, the third job cannot start
> Part of the startup log follows
> 360 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor           -
> Submitting application master application_1554100483755_0013
> 2019-04-04 16:24:23,389 INFO
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl         - Submitted
> application application_1554100483755_0013
> 2019-04-04 16:24:23,389 INFO
> org.apache.flink.yarn.AbstractYarnClusterDescriptor           - Waiting for
> the cluster to be allocated
> 2019-04-04 16:24:23,390 INFO
> org.apache.flink.yarn.AbstractYarnClusterDescriptor           - Deploying
> cluster, current state ACCEPTED
> 2019-04-04 16:25:23,625 INFO
> org.apache.flink.yarn.AbstractYarnClusterDescriptor           - Deployment
> took more than 60 seconds. Please check if the requested resources are
> available in the YARN cluster
> 2019-04-04 16:25:23,876 INFO
> org.apache.flink.yarn.AbstractYarnClusterDescriptor           - Deployment
> took more than 60 seconds. Please check if the requested resources are
> available in the YARN cluster
> 2019-04-04 16:25:24,127 INFO
> org.apache.flink.yarn.AbstractYarnClusterDescriptor           - Deployment
> took more than 60 seconds. Please check if the requested resources are
> available in the YARN cluster
> 2019-04-04 16:25:24,378 INFO
> org.apache.flink.yarn.AbstractYarnClusterDescriptor           - Deployment
> took more than 60 seconds. Please check if the requested resources are
> available in the YARN cluster
>
>
>
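> No other trace information can be found. Checking the YARN console shows that the container cannot be allocated; the page displays the following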
> YarnApplicationState:   ACCEPTED: waiting for AM container to be
> allocated, launched and register with RM.
>
>
> Diagnostics:    [Thu Apr 04 16:33:49 +0800 2019] Application is added to
> the scheduler and is not yet activated.
> Queue's AM resource limit exceeded. Details : AM Partition =
> <DEFAULT_PARTITION>; AM Resource Request = <memory:2048, vCores:1>;
> Queue Resource Limit for AM = <memory:4096, vCores:1>; User AM Resource
> Limit of the queue = <memory:4096, vCores:1>; Queue AM Resource Usage =
> <memory:4096, vCores:2>;
>
>
>
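> 1. Given the machine sizing above and the configured parallelism, and judging by the node view in the YARN console, a lot of memory and CPU is still unused.
> Why does this happen? Is some additional configuration needed?
>
> 2. For the basic settings above (jobmanager.heap.size, taskmanager.heap.size, taskmanager.numberOfTaskSlots), is there anything to watch out for?
> How should they generally be set? I have noticed that in this launch mode every job gets its own jobmanager and its own taskmanager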
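
On question 2 of the quoted mail, a rough way to see what each per-job cluster costs with the quoted settings is sketched below. It assumes, without confirmation from the thread, that the two heap.size values are what gets requested from YARN as container memory, that default slot sharing applies so a job needs as many slots as its highest parallelism, and that YARN does not round the requests up further. The helper names are hypothetical.

```python
import math


def task_managers_needed(parallelism: int, slots_per_tm: int) -> int:
    """TaskManagers required to provide `parallelism` slots."""
    return math.ceil(parallelism / slots_per_tm)


def job_memory_mb(parallelism: int, slots_per_tm: int, jm_mb: int, tm_mb: int) -> int:
    """Container memory for one per-job cluster: one JobManager plus its TaskManagers."""
    return jm_mb + task_managers_needed(parallelism, slots_per_tm) * tm_mb


# Settings quoted in the mail.
JM_MB = 2048          # jobmanager.heap.size
TM_MB = 2048          # taskmanager.heap.size
SLOTS_PER_TM = 4      # taskmanager.numberOfTaskSlots
PARALLELISM = 1       # one parallel instance per job

tms = task_managers_needed(PARALLELISM, SLOTS_PER_TM)
idle_slots = tms * SLOTS_PER_TM - PARALLELISM
footprint_mb = job_memory_mb(PARALLELISM, SLOTS_PER_TM, JM_MB, TM_MB)

if __name__ == "__main__":
    print("TaskManagers per job:", tms)
    print("idle slots per job:", idle_slots)
    print("memory per job (MB):", footprint_mb)
```

With these numbers each job occupies one JobManager and one TaskManager container, and three of the four slots in that TaskManager stay idle, which matches the observation that every job gets its own jobmanager and taskmanager.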