flink-user mailing list archives

From Robert Metzger <rmetz...@apache.org>
Subject Re: How to make Flink to write less temporary files?
Date Mon, 10 Nov 2014 10:39:44 GMT
Hi,

maybe the tmp-dirs are mounted as ramdisks? Some Linux distributions have
this as a default configuration.
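
One quick way to check on a Linux node is to look at which mount backs the
temp directory. A minimal Java sketch (the default path and the RAM-backed
check are illustrative only, not anything Flink ships):

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Prints the /proc/mounts entry backing a directory. "/tmp" is only an
// example default; pass whatever your taskmanager temp dirs point to.
public class MountCheck {
    public static void main(String[] args) throws IOException {
        String dir = args.length > 0 ? args[0] : "/tmp";
        String mount = "";
        String fsType = "unknown";
        for (String line : Files.readAllLines(Paths.get("/proc/mounts"))) {
            String[] f = line.split(" ");
            String mp = f[1];
            // the longest mount point that is a path prefix of dir wins
            boolean prefix = dir.equals(mp)
                    || dir.startsWith(mp.endsWith("/") ? mp : mp + "/");
            if (prefix && mp.length() > mount.length()) {
                mount = mp;
                fsType = f[2];
            }
        }
        System.out.println(dir + " is on " + mount + " (" + fsType + ")");
        if (fsType.equals("tmpfs") || fsType.equals("ramfs")) {
            System.out.println("Backed by RAM: spilled files consume memory, not disk.");
        }
    }
}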

On Mon, Nov 10, 2014 at 2:22 AM, Stephan Ewen <sewen@apache.org> wrote:

> Hi!
>
> With 10 nodes and 25 GB on each node, you have 250 GB of space to spill
> temporary files. You also seem to have roughly the same amount of JVM heap,
> out of which Flink can use roughly 2/3.
>
> When you process 1 TB, 250 GB of JVM heap and 250 GB of temp file space may
> not be enough; it is less than the initial data size.
>
> I think you simply need more disk space for a job like that...
>
> Stephan
>
>
>
>
> On Mon, Nov 10, 2014 at 10:54 AM, Malte Schwarzer <ms@mieo.de> wrote:
>
>> My blobStore files are small, but each *.channel file is around 170MB.
>> Before I start my Flink job I have 25GB of free space available in my
>> tmp-dir and my taskmanager heap size is currently at 24GB. I’m using a
>> cluster with 10 nodes.
>>
>> Is this enough space to process a 1TB file?
>>
>> From: Stephan Ewen <sewen@apache.org>
>> Reply-To: <user@flink.incubator.apache.org>
>> Date: Monday, 10 November 2014 10:35
>> To: <user@flink.incubator.apache.org>
>> Subject: Re: How to make Flink to write less temporary files?
>>
>> I would assume that the blobStore files are rather small (they are only
>> jar files so far).
>>
>> I would look for *.channel files, which are spilled intermediate results.
>> They can get pretty large for large jobs.
>>
>
>
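
If disk space is indeed the limit, one option is to spread the TaskManager
temp directories over several physical disks. A sketch of the relevant
flink-conf.yaml entries (key names as of the 0.7-era configuration; the
paths are examples to replace with your own disks):

  taskmanager.tmp.dirs: /disk1/flink-tmp:/disk2/flink-tmp
  taskmanager.heap.mb: 24576

Flink distributes spilled files across the listed directories, so
directories on separate disks add both spill space and I/O bandwidth.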
