flink-user mailing list archives

From Fabian Hueske <fhue...@gmail.com>
Subject Re: Yarn configuration
Date Mon, 27 Jul 2015 09:19:14 GMT
Hi Michele,

the 10506 MB refers to the size of Flink's managed memory, whereas the 20992
MB refers to the total amount of TM memory. At start-up, the TM allocates a
fraction of the JVM memory as byte arrays and manages this portion by
itself. The remaining memory is used as regular JVM heap for the TM and user
code.
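The split can be sketched roughly as follows. Note that the cutoff and fraction values below are assumed example constants for illustration, not the exact values used by this Flink version (they do not reproduce the 10506 MB figure exactly):

```python
# Illustrative sketch of how a TaskManager container's memory is split.
# The 0.25 YARN safety cutoff and the 0.7 managed-memory fraction are
# assumed example values, not the exact constants of this Flink version.
def split_tm_memory(total_mb, yarn_cutoff=0.25, managed_fraction=0.7):
    heap_mb = total_mb * (1 - yarn_cutoff)   # JVM heap after the YARN cutoff
    managed_mb = heap_mb * managed_fraction  # managed by Flink as byte arrays
    free_heap_mb = heap_mb - managed_mb      # left for TM framework + user code
    return heap_mb, managed_mb, free_heap_mb

heap, managed, free = split_tm_memory(20992)
```

With these assumed constants, a 20992 MB container would keep 15744 MB as JVM heap, of which roughly 11021 MB would be Flink-managed.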

The purpose of the warning is to tell the user that the memory
configuration might not be optimal. However, this of course depends on the
setup environment, and the warning should probably be rephrased to make this
clearer.

Cheers, Fabian

2015-07-27 11:07 GMT+02:00 Michele Bertoni <michele1.bertoni@mail.polimi.it>:

>  I have been able to run 5 TMs with -jm 2048 and -tm 20992 and 8 slots
> each, but the Flink dashboard says “Flink Managed Memory 10506mb” with an
> exclamation mark saying it is much smaller than the physical memory
> (30105mb)… that’s true, but I cannot run the cluster with more than 20992
>
>  thanks
>
>
>
> On 27 Jul 2015, at 11:02, Michele Bertoni <
> michele1.bertoni@mail.polimi.it> wrote:
>
>  Hi Robert,
> thanks for answering. Today I was able to try again: no, in an EMR
> configuration with 1 master and 5 core nodes I have 5 active nodes in the
> ResourceManager… sounds strange to me: Ganglia shows 6 nodes and one is
> always idle
>
>  the total amount of memory is 112.5GB, that is actually 22.5 for each of
> the 5
>
>  now I am a little lost, because I thought I was running 5 nodes for 5 TMs
> and the 6th (master) one as the JM, but it seems like I have to use the 5
> core nodes as both TMs and JM
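The arithmetic in the paragraphs above can be checked directly (the 23040 MB per-node limit is the YARN cap mentioned elsewhere in this thread; the other numbers are quoted here):

```python
# Check the memory arithmetic from this thread.
total_gb = 112.5
nodes = 5
per_node_gb = total_gb / nodes         # 22.5 GB per core node
per_node_mb = int(per_node_gb * 1024)  # 23040 MB, the YARN limit from the thread
jm_mb = 2048                           # JobManager container memory
# If the JM container lands on a worker node, that worker can only offer:
tm_max_mb = per_node_mb - jm_mb        # 20992 MB, exactly the -tm value that worked
```

This is consistent with the observation that the cluster only starts with -tm 20992: the JM container is scheduled on one of the workers and eats into its 23040 MB.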
>
>
>
>  By the way, what is a good value for the number of buffers?
>
>
>  thanks,
> Best
> michele
>
>
>  On 24 Jul 2015, at 16:38, Robert Metzger <
> rmetzger@apache.org> wrote:
>
>  Hi Michele,
>
>  configuring a YARN cluster to allocate all available resources as well
> as possible is sometimes tricky, that is true.
> We are aware of these problems, and there are actually two JIRAs for this:
> https://issues.apache.org/jira/browse/FLINK-937 (Change the YARN Client
> to allocate all cluster resources, if no argument given) --> I think the
> consensus on the issue was to give users an option to allocate everything
> (so don't do it by default)
> https://issues.apache.org/jira/browse/FLINK-1288 (YARN ApplicationMaster
> sometimes fails to allocate the specified number of workers)
>
>  How many NodeManagers is YARN reporting in the ResourceManager UI (in
> the "Active Nodes" column)? (I suspect 6?)
> How much memory per NodeManager is YARN reporting? (You can see this on
> the "Nodes" page of the RM.)
>
>  > I would like to run 5 nodes with 8 slots each, is it correct?
>
>  Yes.
>
>
>  > Then i reduced memories, everything started but i get a runtime error
> of missing buffer
>
>  What exactly is the exception?
> I guess you have to give the system a few more network buffers using the
> taskmanager.network.numberOfBuffers config parameter.
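As a starting point for sizing that parameter, the rule of thumb from the Flink documentation of that era is #slots-per-TM² × #TMs × 4 (treat the exact formula as an assumption here, and tune from there if jobs still fail):

```python
# Rule-of-thumb sizing for taskmanager.network.numberOfBuffers.
# (slots_per_tm ** 2) * num_tms * 4 is the guideline from the Flink
# docs of that era; it is a starting point, not a hard requirement.
def recommended_network_buffers(slots_per_tm, num_tms, factor=4):
    return slots_per_tm ** 2 * num_tms * factor

buffers = recommended_network_buffers(slots_per_tm=8, num_tms=5)
# For 8 slots on each of 5 TMs this suggests 1280 buffers.
```

The result would then go into flink-conf.yaml as, e.g., `taskmanager.network.numberOfBuffers: 1280`.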
>
>  > Can someone help me step-by-step with a good configuration for such a
> cluster? I think the documentation is really missing details
>
>  When starting Flink on YARN, there are usually some WARN log messages in
> the beginning when the system detects that specified containers will not
> fit in the cluster.
> Also, in the ResourceManager UI, you can see the status of the scheduler.
> This often helps to understand what's going on, resource-wise.
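To make the setup discussed here concrete, a session for this cluster could be started along these lines (flag names as in the yarn-session.sh of that Flink generation; the memory values are the ones from this thread, so treat the whole command as an illustrative sketch, not a verified configuration):

```shell
# Illustrative sketch: start a YARN session with 5 TaskManagers,
# 8 slots each, using the memory values discussed in this thread.
# -n: number of TaskManager containers, -s: slots per TaskManager,
# -jm / -tm: JobManager / TaskManager container memory in MB.
./bin/yarn-session.sh -n 5 -s 8 -jm 2048 -tm 20992
```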
>
>
>
> On Fri, Jul 24, 2015 at 3:58 PM, Michele Bertoni <
> michele1.bertoni@mail.polimi.it> wrote:
>
>> Hi everybody, I need help on how to configure a YARN cluster.
>> I tried a lot of configurations, but none of them was correct.
>>
>> We have a cluster on Amazon EMR, let's say 1 manager + 5 workers, all of
>> them m3.2xlarge, so 8 cores and 30 GB of RAM each.
>>
>> What is a good configuration for such a cluster?
>>
>> I would like to run 5 nodes with 8 slots each, is that correct?
>>
>> Now the problems: until now I mistakenly ran all tests using 40 task
>> managers, each with 2048MB and 1 slot (at least it was working).
>>
>> Today I found the error and tried to run 5 task managers, setting a
>> default of 8 slots in conf.yaml and a task manager memory of 23040 (-tm
>> 23040), which is the limit allowed by YARN, but I am getting errors: one TM
>> is not running because there is no available memory. It seems like the JM
>> is not using memory from the master but from the nodes (in fact YARN says
>> TM number 5 is missing 2048, which is the memory for the JM).
>>
>> Then I reduced the memories and everything started, but I get a runtime
>> error of missing buffers.
>>
>> Can someone help me step-by-step with a good configuration for such a
>> cluster? I think the documentation is really missing details.
>>
>> Thanks a lot
>> Best
>> Michele
>>
>
>
>
>
