flink-user mailing list archives

From Stephan Ewen <se...@apache.org>
Subject Re: How to make Flink to write less temporary files?
Date Mon, 10 Nov 2014 10:22:56 GMT
Hi!

With 10 nodes and 25 GB on each node, you have 250 GB of space for spilling
temporary files. You also seem to have roughly the same amount of JVM heap,
of which Flink can use roughly 2/3.

When you process 1 TB, 250 GB of JVM heap and 250 GB of temp file space may
not be enough: together they are less than the initial data size.
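
To make the arithmetic explicit, here is a rough back-of-the-envelope sketch.
It is not Flink code; the function name is made up for illustration, the 2/3
usable-heap fraction is only the rough figure mentioned above, and treating
1 TB as 1000 GB is an assumption.

```python
def cluster_capacity_gb(nodes, tmp_gb_per_node, heap_gb_per_node,
                        usable_heap_fraction=2 / 3):
    """Return (total temp space, total heap, heap usable by Flink) in GB.

    usable_heap_fraction is the rough 2/3 estimate from this thread,
    not a value read from any Flink configuration.
    """
    total_tmp = nodes * tmp_gb_per_node
    total_heap = nodes * heap_gb_per_node
    usable_heap = total_heap * usable_heap_fraction
    return total_tmp, total_heap, usable_heap


# Numbers from the thread: 10 nodes, 25 GB tmp-dir and 24 GB heap per node.
total_tmp, total_heap, usable_heap = cluster_capacity_gb(10, 25, 24)

input_gb = 1000  # assumption: 1 TB taken as 1000 GB
available = total_tmp + usable_heap

print(f"temp space:  {total_tmp} GB")
print(f"total heap:  {total_heap} GB (about {usable_heap:.0f} GB usable)")
print(f"available:   {available:.0f} GB for {input_gb} GB of input")
print("enough" if available >= input_gb else "not enough")
```

Even counting the full heap rather than the usable 2/3, memory plus temp space
stays well below the input size, so large intermediate results have nowhere
to go.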

I think you simply need more disk space for a job like that.

Stephan




On Mon, Nov 10, 2014 at 10:54 AM, Malte Schwarzer <ms@mieo.de> wrote:

> My blobStore files are small, but each *.channel file is around 170 MB.
> Before I start my Flink job, I have 25 GB of free space available in my
> tmp-dir, and my taskmanager heap size is currently 24 GB. I'm using a
> cluster with 10 nodes.
>
> Is this enough space to process a 1TB file?
>
> From: Stephan Ewen <sewen@apache.org>
> Reply-To: <user@flink.incubator.apache.org>
> Date: Monday, 10 November 2014 10:35
> To: <user@flink.incubator.apache.org>
> Subject: Re: How to make Flink to write less temporary files?
>
> I would assume that the blobStore files are rather small (they are only
> jar files so far).
>
> I would look for *.channel files, which are spilled intermediate results.
> They can get pretty large for large jobs.
>
