flink-dev mailing list archives

From "Kruse, Sebastian" <Sebastian.Kr...@hpi.de>
Subject RE: Hardware Requirements
Date Mon, 07 Jul 2014 17:41:29 GMT
Thanks for your answers. Based on what you say, I guess the scaling problem in my program is
the number of data sources. This number is variable and can go beyond 100 (I am analyzing
data dumps). Maybe the number of shuffles or something similar grows with the number of
sources, or the sources simply inflate the plan. That would explain why the execution fails
for the larger datasets.

I am running 10 TaskManagers. Since these have dual-core CPUs, I chose a DOP of 20, and was
even considering 40 for latency hiding. What DOP would you suggest for this setting
(disregarding the buffer limitation)?
Pertaining to the number of concurrent shuffles, I would also like to know what causes a shuffle.
Reduces, cogroups, and joins? And what about unions?

If you are interested, I can play around with the settings a bit more by the end of this
week and report under which circumstances the execution fails or passes.
(Update: the program just passed with 16000 buffers and a DOP of 10.)
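As a rough sketch of how the buffer demand scales, one can use the heuristic from Flink's configuration documentation, #slots-per-TaskManager^2 x #TaskManagers x #concurrent-shuffles. The factor of 4 for concurrent shuffles below is an assumed default and the function is illustrative, not a figure from this thread:

```python
# Back-of-the-envelope estimate of required network buffers for the
# setup discussed in this thread: 10 TaskManagers with dual-core CPUs.
# Heuristic (from Flink's configuration docs): slots^2 * #TMs * #shuffles.
# The shuffle factor of 4 is an assumption, not a value from this thread.

def required_buffers(slots_per_tm: int, num_taskmanagers: int,
                     concurrent_shuffles: int = 4) -> int:
    """Estimate network buffers: each shuffle needs roughly one buffer
    per sender/receiver channel, i.e. slots^2 per TaskManager pair."""
    return slots_per_tm ** 2 * num_taskmanagers * concurrent_shuffles

BUFFER_SIZE = 32768  # default network buffer size in bytes

for slots_per_tm in (1, 2, 4):  # candidate slot counts per dual-core machine
    n = required_buffers(slots_per_tm, 10)
    mb = n * BUFFER_SIZE / (1024 * 1024)
    print(f"{slots_per_tm} slots/TM -> {n} buffers ({mb:.1f} MB)")
```

Note how the estimate grows quadratically with the slots per machine, which would match the observation that lowering the DOP (or raising the buffer count to 16000) lets the job pass.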


-----Original Message-----
From: Ufuk Celebi [mailto:u.celebi@fu-berlin.de] 
Sent: Sonntag, 6. Juli 2014 14:30
To: dev@flink.incubator.apache.org
Subject: Re: Hardware Requirements

Hey Sebastian,

did you already try increasing the number of buffers in accordance with Stephan's suggestion?
The current defaults for the number and size of network buffers are 2048 and 32768 bytes,
resulting in 64 MB of memory for the network buffers.
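The quoted default settings multiply out as follows (a quick sanity check of the 64 MB figure):

```python
# Default network buffer memory: 2048 buffers of 32768 bytes each.
num_buffers = 2048
buffer_size_bytes = 32768

total_bytes = num_buffers * buffer_size_bytes
print(total_bytes // (1024 * 1024), "MB")  # 64 MB
```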

Out of curiosity: on how many machines are you running your job and what parallelism did you
set for your program? 



On 04 Jul 2014, at 15:46, Kruse, Sebastian <Sebastian.Kruse@hpi.de> wrote:

> Hi everyone,
> I apologize in advance if that is not the right mailing list for my question. If there
is a better place for it, please let me know.
> Basically, I wanted to ask if you have some statement about the hardware requirements
of Flink to process larger amounts of data beginning from, say, 20 GBs. Currently, I am facing
issues in my jobs, e.g., there are not enough buffers for safe execution of some operations.
Since the machines that run my TaskTrackers have unfortunately very limited main memory, I
cannot increase the number of buffers (and heap space in general) too much. Currently, I assigned
them 1.5 GB.
> So, the exact questions are:
> *         Do you have experiences with a suitable HW setup for crunching larger amounts
of data, maybe from the TU cluster?
> *         Are there any configuration tips, you can provide, e.g. pertaining to the buffer
> *         Are there any general statements on the growth of Flink's memory requirements
wrt. to the size of the input data?
> Thanks for your help!
> Sebastian
