flink-user mailing list archives

From Manuel Montesino <manuel.montes...@piksel.com>
Subject Re: Problems with taskmanagers in Mesos Cluster
Date Wed, 25 Oct 2017 09:27:22 GMT
Hi Eron,


Thanks for your response.


Maybe I did not explain it well. The thing is that when we redeploy a Flink session, the active
taskmanagers are not killed or stopped, and new ones (with the new configuration) are not
created and started. A full redeploy is what we want. So these are not recovered TMs; they are
still the same ones, with the same configuration.


If we change the ZooKeeper high-availability path, the existing TMs will be orphaned in Mesos
and new ones will be created, and we don't want that.


Another thing is the way we are redeploying. We have developed a script that deploys Flink
jobs through Flink's API (we have a pipeline that performs all these operations). In this script
we stop/kill the session with the /cancel or /cancel-with-savepoint API methods.
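
For readers following along, here is a minimal sketch of the kind of call such a script might make. It is not the script from this thread. It assumes the legacy monitoring REST endpoints `DELETE /jobs/<jobid>/cancel` and `GET /jobs/<jobid>/cancel-with-savepoint`, which should be checked against the documentation of the Flink version in use. The function name, host, and job ID are hypothetical.

```python
from urllib.parse import quote
from urllib.request import Request


def build_cancel_request(base_url, job_id, with_savepoint=False):
    """Build (but do not send) the HTTP request that cancels one job.

    base_url is the jobmanager web endpoint, e.g. "http://jobmanager:8081".
    The endpoint paths below are assumptions about the legacy REST API.
    """
    base = base_url.rstrip("/")
    job = quote(job_id, safe="")
    if with_savepoint:
        # Assumed endpoint: GET /jobs/<jobid>/cancel-with-savepoint
        return Request(f"{base}/jobs/{job}/cancel-with-savepoint", method="GET")
    # Assumed endpoint: DELETE /jobs/<jobid>/cancel
    return Request(f"{base}/jobs/{job}/cancel", method="DELETE")
```

The request can then be sent with `urllib.request.urlopen(build_cancel_request(...))`. Note that both paths sit under `/jobs/<jobid>`, so each call addresses a single job.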


Is it clearer now?


Thanks in advance.


Manuel Montesino
Devops Engineer

E manuel.montesino@piksel(dot)com

Marie Curie,1. Ground Floor. Campanillas, Malaga 29590
liberating viewing | piksel.com

________________________________
From: Eron Wright <eronwright@gmail.com>
Sent: Monday, 23 October 2017 19:03:50
To: Manuel Montesino
Cc: user@flink.apache.org; Product-Flow
Subject: Re: Problems with taskmanagers in Mesos Cluster

If I understand you correctly, the high-availability path isn't being changed but other TM-related
settings are, and the recovered TMs aren't picking up the new configuration. I don't think
that Flink supports on-the-fly reconfiguration of a Task Manager at this time.

As a workaround, to achieve a clean new session when you reconfigure Flink via Marathon, update
the HA path accordingly.
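
The workaround above could be sketched roughly as follows. This is only an illustration: it assumes the HA path is set through the `flink_high-availability.zookeeper.path.root` entry of the Marathon `env` object (the key shown in the configuration quoted below), and the function name and `deploy_id` parameter are hypothetical.

```python
import copy

# Key taken from the quoted Marathon configuration; whether this is the
# right key to vary for a given setup is an assumption.
HA_ROOT_KEY = "flink_high-availability.zookeeper.path.root"


def with_fresh_ha_path(app, deploy_id, base_root="/flink"):
    """Return a copy of a Marathon app definition with a per-deployment HA path.

    The input dict is left untouched so the base template can be reused.
    """
    updated = copy.deepcopy(app)
    env = updated.setdefault("env", {})
    env[HA_ROOT_KEY] = f"{base_root.rstrip('/')}/{deploy_id}"
    return updated
```

For example, `with_fresh_ha_path(app, "deploy-42")` sets the path to `/flink/deploy-42`. Because the new session no longer shares the HA path with the old one, the taskmanagers of the old session are not recovered by it and would still need to be cleaned up separately.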

Would that work for you?



On Wed, Oct 18, 2017 at 6:52 AM, Manuel Montesino <manuel.montesino@piksel.com> wrote:
Hi,

We have deployed a Mesos cluster with Marathon, and we deploy Flink sessions through Marathon
with multiple taskmanagers configured. Sometimes, in the earlier stages, we change the configuration
in the Marathon JSON (memory and other settings). When we redeploy the Flink session, the jobmanagers
stop and start with the new configuration, but the taskmanagers do not: the existing ones are
reused with the configuration they already had. So we have to kill/stop the Docker container of
each taskmanager task.

Is there a way to kill or stop the taskmanagers when the session is redeployed?

Some environment configuration from marathon json file related to taskmanagers:

```
"flink_akka.ask.timeout": "1min",
"flink_akka.framesize": "102400k",
"flink_high-availability": "zookeeper",
"flink_high-availability.zookeeper.path.root": "/flink",
"flink_jobmanager.web.history": "200",
"flink_mesos.failover-timeout": "86400",
"flink_mesos.initial-tasks": "16",
"flink_mesos.maximum-failed-tasks": "-1",
"flink_mesos.resourcemanager.tasks.container.type": "docker",
"flink_mesos.resourcemanager.tasks.mem": "6144",
"flink_metrics.reporters": "jmx",
"flink_metrics.reporter.jmx.class": "org.apache.flink.metrics.jmx.JMXReporter",
"flink_state.backend": "org.apache.flink.contrib.streaming.state.RocksDBStateBackendFactory",
"flink_taskmanager.maxRegistrationDuration": "10 min",
"flink_taskmanager.network.numberOfBuffers": "8192",
"flink_jobmanager.heap.mb": "768",
"flink_taskmanager.debug.memory.startLogThread": "true",
"flink_mesos.resourcemanager.tasks.cpus": "1.3",
"flink_env.java.opts.taskmanager": "-XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:ConcGCThreads=1 -XX:InitiatingHeapOccupancyPercent=35 -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=50 -XX:MaxMetaspaceFreeRatio=80 -XX:+DisableExplicitGC -Djava.awt.headless=true -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=10M",
"flink_containerized.heap-cutoff-ratio": "0.67"
```
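
The `flink_` prefix on these keys suggests that the container entrypoint turns them into Flink configuration entries. The thread does not say how this is done, so the following is only a guess at the convention: a minimal sketch, with a hypothetical function name, that strips the prefix and emits `key: value` lines in the `flink-conf.yaml` style.

```python
def env_to_flink_conf(env, prefix="flink_"):
    """Turn prefixed environment entries into flink-conf.yaml style lines.

    Entries without the prefix are ignored. The prefix convention itself
    is an assumption about how the container image is built.
    """
    lines = []
    for name, value in sorted(env.items()):
        if name.startswith(prefix):
            lines.append(f"{name[len(prefix):]}: {value}")
    return lines
```

Under this assumption, `"flink_mesos.resourcemanager.tasks.mem": "6144"` becomes the line `mesos.resourcemanager.tasks.mem: 6144`. Such values are read when a process starts, which fits the observation that taskmanagers that keep running do not see the change.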

Thanks in advance and kind regards,

Manuel Montesino
