flink-user mailing list archives

From Till Rohrmann <trohrm...@apache.org>
Subject Re: Configuring task slots and parallelism for single node Maven executed
Date Mon, 18 Apr 2016 07:48:44 GMT
Hi Prez,

   1. The configuration setting taskmanager.numberOfTaskSlots determines how
   many task slots a TaskManager is started with. As a rough rule of thumb,
   set this value to the number of cores of the machine the TM is running on.
   See [1] for further information. The configuration value
   parallelism.default is the default parallelism with which a program will
   be executed if the user didn’t specify it via the submission tool or from
   within the program.

   2. You can configure the parallelism programmatically by calling
   setParallelism on the ExecutionEnvironment. The GlobalConfiguration
   approach won’t work in a distributed setting.

   3. See 1.

[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.0/concepts/concepts.html#workers-slots-resources
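
The rule of thumb in point 1 can be sketched in plain Java. This is only an illustration: SlotSizing and its methods are hypothetical helpers, not part of Flink, and the assumption that the default parallelism should not exceed the slot count applies to a single TaskManager.

```java
// Hypothetical helper, not part of Flink: derives the two settings discussed
// above from the number of cores the JVM reports.
final class SlotSizing {

    // Rule of thumb from the reply: one task slot per core, never fewer than one.
    static int suggestedSlots(int cores) {
        return Math.max(1, cores);
    }

    // Assumption: with a single TaskManager, cap the parallelism at the slot count.
    static int suggestedParallelism(int slots, int requested) {
        return Math.max(1, Math.min(slots, requested));
    }

    // Formats the values as the two flink-conf.yaml entries named in the thread.
    static String toConfigLines(int slots, int parallelism) {
        return "taskmanager.numberOfTaskSlots: " + slots + "\n"
             + "parallelism.default: " + parallelism;
    }

    public static void main(String[] args) {
        // Cores visible to this JVM; may be fewer than the physical cores.
        int cores = Runtime.getRuntime().availableProcessors();
        int slots = suggestedSlots(cores);
        int parallelism = suggestedParallelism(slots, slots);
        System.out.println(toConfigLines(slots, parallelism));
    }
}
```

Inside a Flink program, the resulting parallelism value would be the argument passed to setParallelism on the ExecutionEnvironment, as described in point 2.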

Cheers,
Till

On Mon, Apr 18, 2016 at 6:55 AM, Balaji Rajagopalan <
balaji.rajagopalan@olacabs.com> wrote:

> Answered based on my understanding.
>
> On Mon, Apr 18, 2016 at 8:12 AM, Prez Cannady <
> revprez@correlatesystems.com> wrote:
>
>> Some background.
>>
>> I’m running a Flink application on a single machine, instrumented by Spring
>> Boot and launched via the Maven Spring Boot plugin. Basically, I’m trying
>> to figure out how much I can squeeze out of a single node processing my
>> task before committing to a cluster solution.
>>
>> Couple of questions.
>>
>>    1. I assume the configuration options taskmanager.numberOfTaskSlots
>>    and parallelism.default pertain to division of work on a single node.
>>    Am I correct?
>>
> You will be running a single TaskManager instance. If you are running
> on a 4-core machine, for example, you can set the parallelism to 4.
>>
>>
>>    2. Is there a way to configure these options programmatically instead
>>    of the configuration YAML? Or some Maven tooling that can ingest a properly
>>    formatted Flink config? For the record, I’m currently trying
>>    GlobalConfiguration.getConfiguration.setInteger("<config option name>",
>>    <config option value>). I am also going to try
>>    supplying them as properties in the pom. I’m preparing some tests to see if
>>    either of these does what I expect, but thought I’d ask in case I’m heading
>>    down a rabbit hole.
>>
> I have been using GlobalConfiguration with no issues, but here is one
> thing you have to be aware of: in a clustered environment, you will have to
> copy the YAML file to all the nodes. For example, I read the file from
> /usr/share/flink/conf, and I make sure this file is available on the master
> node and the task nodes as well. Why do you want to ingest the config from a
> Maven tool? You can do this in the main routine of your application code.
>
>>
>>    3. I figure the number of task slots is limited to the number of
>>    processors/cores/whatever available (and the JVM can get at). Is this
>>    accurate?
>>
>> Any feedback would be appreciated.
>>
>> Prez Cannady
>> p: 617 500 3378
>> e: revprez@opencorrelate.org
>> GH: https://github.com/opencorrelate
>> LI: https://www.linkedin.com/in/revprez
>
