flink-user mailing list archives

From Congxian Qiu <qcx978132...@gmail.com>
Subject Re: Improved performance when using incremental checkpoints
Date Wed, 17 Jun 2020 13:05:32 GMT
Hi Nick

The result is a bit weird. Did you compare the disk utilization/performance
before and after enabling checkpoints?

Best,
Congxian


Yun Tang <myasuka@live.com> wrote on Wed, Jun 17, 2020 at 8:56 PM:

> Hi Nick
>
> I think this thread uses the same program as the one discussed in the
> "MapState bad performance" thread.
> Please provide a simple program that reproduces this so that we can
> help you more.
>
> Best
> Yun Tang
> ------------------------------
> *From:* Aljoscha Krettek <aljoscha@apache.org>
> *Sent:* Tuesday, June 16, 2020 19:53
> *To:* user@flink.apache.org <user@flink.apache.org>
> *Subject:* Re: Improved performance when using incremental checkpoints
>
> Hi,
>
> it might be that the operations Flink performs on RocksDB during
> checkpointing "poke" RocksDB somehow and make it clean up its
> internal storage hierarchies more. Other than that, I'm also a bit
> surprised by this.
>
> Maybe Yun Tang will come up with another idea.
>
> Best,
> Aljoscha
>
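
[For reference, a minimal sketch of the configuration the thread is
comparing, written against the Flink 1.9/1.10 RocksDB backend API. The
checkpoint interval and path are placeholder example values, not
recommendations:]

```java
import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointConfigSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // RocksDB state backend with incremental checkpoints enabled
        // (second constructor argument); the URI is a placeholder.
        env.setStateBackend(
                new RocksDBStateBackend("file:///tmp/flink-checkpoints", true));

        // Checkpoint every 60 s; the interval is an arbitrary example value.
        env.enableCheckpointing(60_000);
    }
}
```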
> On 16.06.20 12:42, nick toker wrote:
> > Hi,
> >
> > We used both flink versions 1.9.1 and 1.10.1
> > We used rocksDB default configuration.
> > The streaming pipeline is very simple.
> >
> > 1. Kafka consumer
> > 2. Process function
> > 3. Kafka producer
> >
> > The code of the process function is listed below:
> >
> > private transient MapState<String, Object> testMapState;
> >
> > @Override
> > public void processElement(Map<String, Object> value, Context ctx,
> >         Collector<Map<String, Object>> out) throws Exception {
> >     if (testMapState.isEmpty()) {
> >         testMapState.putAll(value);
> >         out.collect(value);
> >         testMapState.clear();
> >     }
> > }
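
[A plain-Java analogue may make the state traffic visible in isolation. This
is a hypothetical stand-in, not the Flink MapState API: a HashMap plays the
role of the state. It shows that every incoming element performs an
isEmpty() read, one put per map entry, and a clear(); with the RocksDB
backend each such access crosses JNI and serializes/deserializes keys and
values, even though the state is logically empty on every call:]

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the function above; a HashMap replaces MapState.
public class ProcessSketch {
    private final Map<String, Object> testMapState = new HashMap<>();

    // Mirrors processElement: returns the element it would collect, or null.
    public Map<String, Object> processElement(Map<String, Object> value) {
        if (testMapState.isEmpty()) {    // one state read per element
            testMapState.putAll(value);  // one state write per map entry
            testMapState.clear();        // one state delete per map entry
            return value;                // mirrors out.collect(value)
        }
        return null;
    }

    public static void main(String[] args) {
        ProcessSketch sketch = new ProcessSketch();
        Map<String, Object> in = new HashMap<>();
        in.put("k", 1);
        System.out.println(sketch.processElement(in)); // {k=1}
        // State was cleared, so the next element is collected as well.
        System.out.println(sketch.processElement(in)); // {k=1}
    }
}
```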
> >
> > We used the same code with ValueState and observed the same results.
> >
> >
> > BR,
> >
> > Nick
> >
> >
> > On Tue, Jun 16, 2020 at 11:56, Yun Tang <myasuka@live.com> wrote:
> >
> >> Hi Nick
> >>
> >> It's really strange that performance improves when checkpointing is
> >> enabled.
> >> In general, enabling checkpointing adds a slight performance overhead
> >> to the whole job.
> >>
> >> Could you give more details, e.g. the Flink version, the RocksDB
> >> configuration, and a simple program that reproduces this problem?
> >>
> >> Best
> >> Yun Tang
> >> ------------------------------
> >> *From:* nick toker <nick.toker.dev@gmail.com>
> >> *Sent:* Tuesday, June 16, 2020 15:44
> >> *To:* user@flink.apache.org <user@flink.apache.org>
> >> *Subject:* Improved performance when using incremental checkpoints
> >>
> >> Hello,
> >>
> >> We are using RocksDB as the state backend.
> >> At first we didn't enable the checkpointing mechanism.
> >>
> >> We observed the following behaviour and are wondering why:
> >>
> >> When using RocksDB *without* checkpoints, the performance was
> >> extremely bad.
> >> When we enabled checkpoints, the performance improved by a
> >> *factor of 10*.
> >>
> >> Could you please explain whether this behaviour is expected?
> >> Could you please explain why enabling checkpoints significantly
> >> improves the performance?
> >>
> >> BR,
> >> Nick
> >>
> >
>
>
