flink-user mailing list archives

From Malte Schwarzer ...@mieo.de>
Subject Re: How to make Flink to write less temporary files?
Date Mon, 10 Nov 2014 16:59:51 GMT
What's the estimated amount of disk space for such a job? Or how can I
calculate it?

Malte

From:  Stephan Ewen <sewen@apache.org>
Reply-To:  <user@flink.incubator.apache.org>
Date:  Monday, 10 November 2014 11:22
To:  <user@flink.incubator.apache.org>
Subject:  Re: How to make Flink to write less temporary files?

Hi!

With 10 nodes and 25 GB on each node, you have 250 GB space to spill
temporary files. You also seem to have roughly the same size in JVM Heap,
out of which Flink can use roughly 2/3.

When you process 1 TB, 250 GB of JVM heap and 250 GB of temp file space may not
be enough; together they are less than the initial data size.
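The capacity check above can be sketched as simple arithmetic (my own back-of-the-envelope illustration using the numbers from this thread, not anything Flink computes for you):

```java
// Rough capacity estimate for a spilling job, using the figures from this
// thread: 10 nodes, 25 GB free tmp space and 24 GB taskmanager heap per node.
public class SpillEstimate {
    public static void main(String[] args) {
        long inputGb = 1024;        // ~1 TB of input data
        int nodes = 10;
        long tmpPerNodeGb = 25;     // free space in each node's tmp-dir
        long heapPerNodeGb = 24;    // taskmanager heap per node

        long totalTmpGb = nodes * tmpPerNodeGb;             // 250 GB spill space
        long usableHeapGb = nodes * heapPerNodeGb * 2 / 3;  // Flink can use ~2/3 of heap

        System.out.println("spill space:  " + totalTmpGb + " GB");
        System.out.println("usable heap:  " + usableHeapGb + " GB");
        System.out.println("input fits:   " + (totalTmpGb + usableHeapGb >= inputGb));
    }
}
```

With these numbers the combined capacity (410 GB) stays well below the 1 TB input, which is why the job runs out of space.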

I think you simply need more disk space for a job like that...
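If more local disks are available, Flink can spread its spill files across several directories via the `taskmanager.tmp.dirs` key in flink-conf.yaml. A sketch with placeholder paths (verify the key and separator against your Flink version's configuration docs):

```yaml
# flink-conf.yaml -- example only; the paths are placeholders.
# Multiple directories are separated by the system path separator
# (':' on Unix), and spilled files are rotated across them.
taskmanager.tmp.dirs: /disk1/flink-tmp:/disk2/flink-tmp
```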

Stephan




On Mon, Nov 10, 2014 at 10:54 AM, Malte Schwarzer <ms@mieo.de> wrote:
> My blobStore files are small, but each *.channel file is around 170MB. Before
> I start my Flink job I've 25GB free space available in my tmp-dir and my
> taskmanager heap size is currently at 24GB. I'm using a cluster with 10 nodes.
> 
> Is this enough space to process a 1TB file?
> 
> From:  Stephan Ewen <sewen@apache.org>
> Reply-To:  <user@flink.incubator.apache.org>
> Date:  Monday, 10 November 2014 10:35
> To:  <user@flink.incubator.apache.org>
> Subject:  Re: How to make Flink to write less temporary files?
> 
> 
> I would assume that the blobStore files are rather small (they are only jar
> files so far).
> 
> I would look for *.channel files, which are spilled intermediate results. They
> can get pretty large for large jobs.



