flink-user mailing list archives

From Stefan Richter <s.rich...@data-artisans.com>
Subject Re: Frequent Full GC's in case of FSStateBackend
Date Fri, 10 Feb 2017 14:19:50 GMT
Async snapshotting is the default. 
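
A minimal sketch of the flink-conf.yaml setup described in the question below. The key names are assumed from the Flink 1.2.x configuration and should be checked against the documentation for your version; the checkpoint URI is a placeholder, not a value from this thread.

```yaml
# Select RocksDB as the cluster-wide default state backend.
state.backend: rocksdb
# Placeholder URI: where checkpoint data is written (assumed key name for Flink 1.2.x).
state.backend.fs.checkpointdir: hdfs:///flink/checkpoints
```

A backend set in code at the job level would take precedence over this cluster default.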

> On 10.02.2017 at 14:03, vinay patil <vinay18.patil@gmail.com> wrote:
> 
> Hi Stefan,
> 
> Thank you for the clarification.
> Yes, with RocksDB I don't see Full GC happening. I am using Flink 1.2.0 and have set
> the state backend to rocksdb in the flink-conf.yaml file. Does this do asynchronous
> checkpointing by default, or do I have to specify it at the job level?
> 
> Regards,
> Vinay Patil
> 
> On Fri, Feb 10, 2017 at 4:16 PM, Stefan Richter wrote:
> Hi,
> 
> FSStateBackend operates completely on-heap; only the snapshots taken for checkpoints are
> written to the file system. This is why the backend is typically faster for small state, but
> it can become problematic for larger state. If your state exceeds a certain size, you should
> strongly consider using RocksDB as the backend. In particular, RocksDB also offers asynchronous
> snapshots, which are very valuable for keeping stream processing running with large state.
> RocksDB works on native memory and disk, so its state is not subject to JVM garbage collection.
> If your state fits in memory but GC is a problem, you could try the G1 garbage collector,
> which offers better performance for FSStateBackend than the default collector.
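> 
> As a sketch of the G1 suggestion above, assuming the JVM options are passed through
> flink-conf.yaml. The checkpoint URI is a placeholder, and the key names should be
> verified against the configuration reference for your Flink version.
> 
> ```yaml
> # Keep state on the JVM heap and write checkpoints to a file system.
> state.backend: filesystem
> # Placeholder URI for the checkpoint directory.
> state.backend.fs.checkpointdir: hdfs:///flink/checkpoints
> # Extra JVM options for the Flink processes: switch to the G1 collector.
> env.java.opts: -XX:+UseG1GC
> ```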
> 
> Best,
> Stefan
> 
> 
>> On 10.02.2017 at 11:16, Vinay Patil wrote:
>> 
>> Hi,
>> 
>> I am running a performance test of my pipeline with FSStateBackend, and I have observed
>> frequent Full GCs after processing 20M records.
>> 
>> When I did a memory analysis using MAT, it showed that many objects maintained
>> by Flink state are still live.
>> 
>> Flink keeps the state in memory even after checkpointing. When is this state
>> removed or garbage-collected? (I am using a window operator whose input is a DTO.)
>> 
>> Also, why does Flink keep the state in memory after checkpointing?
>> 
>> P.S. Using RocksDB does not cause any Full GCs.
>> 
>> Regards,
>> Vinay Patil