flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Yassine MARZOUGUI <y.marzou...@mindlytix.com>
Subject Checkpoints very slow with high backpressure
Date Sun, 23 Apr 2017 18:53:45 GMT
Hi all,

I have a Streaming pipeline as follows:
1 - read a folder continuousely from HDFS
2 - filter duplicates (using keyby(x->x) and keeping a state per key
indicating whether its is seen)
3 - schedule some future actions on the stream using ProcessFunction and
processing time timers (elements are kept in a MapState)
4- write results back to HDFS using a BucketingSink.

I am using RocksdbStateBackend, and Flink 1.3-SNAPSHOT (Commit: 9fb074c).

Currenlty the source contain just one a file of 1GB, so that's the maximum
state that the job might hold. I noticed that the backpressure on the
operators #1 and #2 is High, and the split reader has only read 60 Mb out
of 1Gb source source file. I suspect this is because the ProcessFunction is
slow (on purpose). However looks like this affected the checkpoints which
are failing after the timeout (which is set to 2 hours), see attached

In the job manager logs I keep getting warnings :

2017-04-23 19:32:38,827 WARN
org.apache.flink.runtime.checkpoint.CheckpointCoordinator     -
Received late message for now expired checkpoint attempt 8 from
210769a077c67841d980776d8caece0a of job

Is the high backpressure the cause for the checkpoints being too slow? If
yes Is there a way to disbale the backpressure mechanism since the records
will be buffered in the rocksdb state after all which is backed by the disk?

Thank you.


View raw message