flink-user-zh mailing list archives

From Congxian Qiu <qcx978132...@gmail.com>
Subject Re: Discussion of large Flink state reading from disk, saturating disk IO, and tasks interfering with each other
Date Mon, 23 Sep 2019 12:25:38 GMT
Hi

As you described, a single disk still hits an IO bottleneck for a single task. Is this a single container? As others have said, you first need to confirm that this much IO is actually expected. If it is, you can try increasing the block cache and memtable sizes so that more data stays in memory.
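
A minimal sketch of how that could be wired up, assuming Flink 1.9's
OptionsFactory hook for the RocksDB state backend (the class name and
sizes below are illustrative, not recommendations):

    import org.apache.flink.contrib.streaming.state.OptionsFactory;
    import org.rocksdb.BlockBasedTableConfig;
    import org.rocksdb.ColumnFamilyOptions;
    import org.rocksdb.DBOptions;

    public class BiggerMemoryOptionsFactory implements OptionsFactory {

        @Override
        public DBOptions createDBOptions(DBOptions currentOptions) {
            return currentOptions; // DB-level options left unchanged
        }

        @Override
        public ColumnFamilyOptions createColumnOptions(ColumnFamilyOptions currentOptions) {
            // A larger block cache keeps more uncompressed blocks in memory for reads.
            BlockBasedTableConfig tableConfig = new BlockBasedTableConfig()
                    .setBlockCacheSize(256 * 1024 * 1024);   // 256 MB, illustrative
            return currentOptions
                    .setTableFormatConfig(tableConfig)
                    // A larger memtable absorbs more writes before flushing to disk.
                    .setWriteBufferSize(128 * 1024 * 1024);  // 128 MB, illustrative
        }
    }

The factory is then registered via RocksDBStateBackend#setOptions(...).
Note that these caches and buffers are allocated per column family (i.e.
per registered state) in every RocksDB instance, so watch total memory use.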

Also, which state type are you using? If it is ValueState or ListState, could you switch to MapState instead? Meanwhile, take a look at the RocksDB logs to see whether anything else can be optimized.
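
For example (a hypothetical sketch): with the RocksDB backend, each MapState
entry is stored under its own key, so a point lookup or update touches only
that entry, while a ValueState holding a whole collection is deserialized
and reserialized in full on every access:

    import org.apache.flink.api.common.state.MapState;
    import org.apache.flink.api.common.state.MapStateDescriptor;
    import org.apache.flink.api.common.typeinfo.Types;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
    import org.apache.flink.util.Collector;

    // Hypothetical example: per-sub-key counters kept in MapState instead of
    // a ValueState<Map<String, Long>> that would be rewritten wholesale.
    public class PerSubKeyCounter extends KeyedProcessFunction<String, String, Long> {

        private transient MapState<String, Long> counts;

        @Override
        public void open(Configuration parameters) {
            counts = getRuntimeContext().getMapState(
                    new MapStateDescriptor<>("counts", Types.STRING, Types.LONG));
        }

        @Override
        public void processElement(String subKey, Context ctx, Collector<Long> out)
                throws Exception {
            Long current = counts.get(subKey);    // point read: one RocksDB get
            long updated = (current == null) ? 1L : current + 1L;
            counts.put(subKey, updated);          // point write: one RocksDB put
            out.collect(updated);
        }
    }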


Best,
Congxian


Biao Liu <mmyy1110@gmail.com> wrote on Mon, Sep 23, 2019 at 2:39 PM:

> Hello,
>
> Is this much IO really expected? And it is the disk reads that are saturated.
> Have you tried any tuning?
> 1. Application-level tuning, e.g. whether state is being used reasonably
> 2. System-level tuning, e.g. incremental checkpoints [1] (see the sketch below)
>
> 1.
>
> https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/stream/state/checkpointing.html#state-backend-incremental
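>
> A minimal sketch of enabling it, assuming the RocksDB backend (the
> checkpoint URI and interval are hypothetical; the constructor declares
> IOException):
>
>     import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
>     import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
>
>     StreamExecutionEnvironment env =
>             StreamExecutionEnvironment.getExecutionEnvironment();
>     // The second argument enables incremental checkpoints, so only new or
>     // changed SST files are uploaded instead of the full state each time.
>     RocksDBStateBackend backend =
>             new RocksDBStateBackend("hdfs:///flink/checkpoints", true);
>     env.setStateBackend(backend);
>     env.enableCheckpointing(60_000);  // hypothetical 60 s interval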
>
> Thanks,
> Biao /'bɪ.aʊ/
>
>
>
> On Tue, 10 Sep 2019 at 14:39, Wesley Peng <wesley@thepeng.eu> wrote:
>
> >
> >
> > on 2019/9/10 13:47, 蒋涛涛 wrote:
> > > What we have tried:
> > >
> > > 1. Manually migrating high-IO tasks to other machines, but YARN places tasks fairly randomly, so this only works occasionally.
> > >
> > > 2. We have no SSDs, only ordinary SATA disks. We added two more disks to raise total disk IO capacity, but the single-disk IO bottleneck for a single task is still there.
> > >
> > > What other strategies could solve or at least mitigate this?
> >
> > It seems the tricks to improve RocksDB's throughput might be helpful.
> >
> > With writes and reads mostly touching recent data, the goal is to
> > keep that data in memory as much as possible without using up all the
> > memory on the server. The following parameters are worth tuning:
> >
> > Block cache size: When uncompressed blocks are read from SSTables, they
> > are cached in memory. The amount of data that can be stored before
> > eviction policies apply is determined by the block cache size. The
> > bigger the better.
> >
> > Write buffer size: How big a Memtable can get before it is frozen.
> > Generally, the bigger the better. The tradeoff is that a big write buffer
> > takes more memory and takes longer to flush to disk and to recover.
> >
> > Write buffer number: How many Memtables to keep before flushing to
> > SSTable. Generally, the bigger the better. Similarly, the tradeoff is
> > that too many write buffers take up more memory and take longer to flush
> > to disk.
> >
> > Minimum write buffers to merge: If the most recently written keys are
> > frequently changed, it is better to flush only the latest version to
> > SSTable. This parameter controls how many Memtables RocksDB will try to
> > merge before flushing to SSTable. It should be less than the write
> > buffer number. A suggested value is 2. If the number is too big, merging
> > the buffers takes longer, and there is less chance of duplicate keys
> > spanning that many buffers.
> >
> > The list above is far from being exhaustive, but tuning them correctly
> > can have a big impact on performance. Please refer to RocksDB’s Tuning
> > Guide for more details on these parameters. Figuring out the optimal
> > combination of values for all of them is an art in itself.
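> >
> > A minimal sketch of these four knobs with the RocksDB Java API (the
> > values are illustrative, not recommendations):
> >
> >     import org.rocksdb.BlockBasedTableConfig;
> >     import org.rocksdb.ColumnFamilyOptions;
> >
> >     ColumnFamilyOptions options = new ColumnFamilyOptions()
> >             // block cache size: uncompressed blocks cached for reads
> >             .setTableFormatConfig(new BlockBasedTableConfig()
> >                     .setBlockCacheSize(256 * 1024 * 1024))
> >             // write buffer size: how big a Memtable may grow
> >             .setWriteBufferSize(128 * 1024 * 1024)
> >             // write buffer number: Memtables kept before flushing
> >             .setMaxWriteBufferNumber(4)
> >             // minimum write buffers to merge: suggested value is 2
> >             .setMinWriteBufferNumberToMerge(2);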
> >
> > please ref: https://klaviyo.tech/flinkperf-c7bd28acc67
> >
> > regards.
> >
>