flink-user mailing list archives

From Tae-Geon Um <taegeo...@gmail.com>
Subject Re: Code related to spilling data to disk
Date Wed, 22 Jun 2016 11:39:02 GMT
Thank you for your answer to my question, Chiwan :)  
Can I ask another question?  


> On Jun 22, 2016, at 7:22 PM, Chiwan Park <chiwanpark@apache.org> wrote:
> 
> Hi Tae-Geon,
> 
> AFAIK, spilling *data* to disk happens only when managed memory is used. Currently, the streaming
> API (DataStream) doesn’t use managed memory yet. `MutableHashTable` is one representative
> use of managed memory with disk spilling. Note that some special structures such as `CompactingHashTable`
> don’t spill data to disk even though they use managed memory to achieve high performance.

As far as I understand, data is spilled only in batch mode. 
Do you know why streaming mode does not use managed memory? 
Is it because the performance gain would be negligible?
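
To make the idea of spilling concrete, here is a minimal sketch in plain Java of a buffer that keeps records in memory up to a budget and writes them to a temporary file once the budget is full. This is only an illustration under my own assumptions: `SpillingBuffer`, `maxInMemory`, and the line-per-record file format are hypothetical names and choices, and none of this is how `MutableHashTable` or Flink's managed memory is actually implemented.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of spill-on-threshold; not Flink code. */
class SpillingBuffer {
    // Record-count budget; stands in for a real memory budget (assumption).
    private final int maxInMemory;
    private final List<String> inMemory = new ArrayList<>();
    private Path spillFile;       // created lazily on the first spill
    private int spilledCount = 0; // number of records written to disk so far

    SpillingBuffer(int maxInMemory) {
        if (maxInMemory < 1) {
            throw new IllegalArgumentException("maxInMemory must be >= 1");
        }
        this.maxInMemory = maxInMemory;
    }

    /** Adds a record; if the in-memory budget is already full, spills first. */
    void add(String record) {
        if (inMemory.size() >= maxInMemory) {
            spill();
        }
        inMemory.add(record);
    }

    /** Appends all in-memory records to the spill file and clears them. */
    private void spill() {
        try {
            if (spillFile == null) {
                spillFile = Files.createTempFile("spill-", ".txt");
                spillFile.toFile().deleteOnExit();
            }
            try (BufferedWriter writer = Files.newBufferedWriter(
                    spillFile, StandardCharsets.UTF_8, StandardOpenOption.APPEND)) {
                // One record per line; assumes records contain no line breaks.
                for (String r : inMemory) {
                    writer.write(r);
                    writer.newLine();
                }
            }
            spilledCount += inMemory.size();
            inMemory.clear();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns all records in insertion order: spilled ones first, then in-memory ones. */
    List<String> readAll() {
        List<String> out = new ArrayList<>();
        if (spillFile != null) {
            try {
                out.addAll(Files.readAllLines(spillFile, StandardCharsets.UTF_8));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        out.addAll(inMemory);
        return out;
    }

    int spilledCount() { return spilledCount; }

    int inMemoryCount() { return inMemory.size(); }
}
```

With a budget of two records, adding five records leaves four on disk and one in memory, and `readAll()` still returns all five in insertion order.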

> 
> About spilling *states*, I think it depends on how the state backend is implemented.
> For example, `FsStateBackend` saves state to the file system but `MemoryStateBackend` doesn’t.
> `RocksDBStateBackend` uses memory first and can also spill state to disk.

I’ve found a nice document on state backends [1]. I will take a look at it to learn
the details. 
Thanks! 

Taegeon

[1]: https://ci.apache.org/projects/flink/flink-docs-master/apis/streaming/state_backends.html#state-backends

> 
> Regards,
> Chiwan Park
> 
>> On Jun 22, 2016, at 3:27 PM, Tae-Geon Um <taegeonum@gmail.com> wrote:
>> 
>> I have another question. 
>> Is spilling performed only in batch mode? 
>> What happens in streaming mode?  
>> 
>>> On Jun 22, 2016, at 1:48 PM, Tae-Geon Um <taegeonum@gmail.com> wrote:
>>> 
>>> Hi, all
>>> 
>>> As far as I know, Flink spills data (states?) to disk if the data exceeds a memory
>>> threshold or there is memory pressure.
>>> I’d like to know the details of how Flink spills data to disk. 
>>> 
>>> Could you please let me know which code I should look into? 
>>> 
>>> Thanks,
>>> Taegeon
>> 
> 

