flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-8639) Fix always need to seek multiple times when iterator RocksDBMapState
Date Wed, 21 Feb 2018 13:04:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-8639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16371379#comment-16371379 ]

ASF GitHub Bot commented on FLINK-8639:

Github user StefanRRichter commented on the issue:

    While `128` may already be up to 128x better, a more scientific way to pick the
constant would be to benchmark with very small entries and find the point beyond which
higher values give diminishing returns. Even for fairly large entries, a cache of 128 should
still be small in absolute terms. Maybe we can go higher, or maybe 128 is already past the
point of diminishing returns. If you want, you could measure it. If you don't have the time,
then 128 may be OK for now.
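
As a rough first pass before benchmarking against a real RocksDB instance, the seek count can be modeled for a sweep of candidate cache sizes. The sketch below assumes a fixed batch size and that each cache refill costs exactly one seek. `seeksFor`, `sweep`, and `kneePoint` are hypothetical helpers, not Flink or RocksDB APIs, and the model ignores entry size and real seek latency, so it does not replace the measurement suggested above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model, not Flink code: counts seeks only, not wall-clock time.
final class CacheSizeSweepSketch {

    // Seeks needed to iterate `entries` entries when each seek loads up to
    // `cacheSize` entries. An empty range still costs one seek.
    static long seeksFor(long entries, int cacheSize) {
        if (cacheSize < 1) {
            throw new IllegalArgumentException("cacheSize must be >= 1");
        }
        if (entries == 0) {
            return 1;
        }
        return (entries + cacheSize - 1) / cacheSize; // ceiling division
    }

    // Seek counts for cache sizes 1, 2, 4, ... up to maxCacheSize.
    static Map<Integer, Long> sweep(long entries, int maxCacheSize) {
        Map<Integer, Long> result = new LinkedHashMap<>();
        for (int size = 1; size <= maxCacheSize; size *= 2) {
            result.put(size, seeksFor(entries, size));
        }
        return result;
    }

    // Smallest power-of-two size for which doubling once more saves less than
    // `minSavedFraction` of the seeks needed at cache size 1. Falls back to
    // maxCacheSize if every doubling in the sweep still pays off.
    static int kneePoint(long entries, int maxCacheSize, double minSavedFraction) {
        long baseline = seeksFor(entries, 1);
        for (int size = 1; size * 2 <= maxCacheSize; size *= 2) {
            long saved = seeksFor(entries, size) - seeksFor(entries, size * 2);
            if ((double) saved / baseline < minSavedFraction) {
                return size;
            }
        }
        return maxCacheSize;
    }

    public static void main(String[] args) {
        sweep(100_000, 1024).forEach((size, seeks) ->
                System.out.println("cacheSize=" + size + " seeks=" + seeks));
        System.out.println("knee at 1% threshold: " + kneePoint(100_000, 1024, 0.01));
    }
}
```

In this model each doubling halves the remaining seeks, so the absolute saving shrinks quickly. Where the real knee sits depends on entry size and seek cost, which only a benchmark can show.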

> Fix always need to seek multiple times when iterator RocksDBMapState
> --------------------------------------------------------------------
>                 Key: FLINK-8639
>                 URL: https://issues.apache.org/jira/browse/FLINK-8639
>             Project: Flink
>          Issue Type: Improvement
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.4.0
>            Reporter: Sihua Zhou
>            Assignee: Sihua Zhou
>            Priority: Critical
>             Fix For: 1.5.0
> Currently, almost every time we iterate over a RocksDBMapState we need to seek
at least 2 times (seek is an expensive operation in RocksDB because it can't use the bloom filter).
This is because `RocksDBMapIterator` uses `cacheEntries` to cache the entries loaded by each seek,
and the initial size of `cacheEntries` is 1.
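
The effect of the cache size on the number of seeks can be illustrated with a simplified model. The sketch below is not the actual `RocksDBMapIterator`: it assumes a `TreeMap` as a stand-in for RocksDB, treats each `tailMap` call as one seek, and uses a fixed batch size. `SeekCountSketch`, `buildStore`, and `countSeeks` are hypothetical names introduced for illustration.

```java
import java.util.Iterator;
import java.util.NavigableMap;
import java.util.TreeMap;

// Simplified model of a batch-loading iterator; a TreeMap stands in for RocksDB.
final class SeekCountSketch {

    // Builds a sorted store with keys 0..entries-1.
    static NavigableMap<Integer, String> buildStore(int entries) {
        NavigableMap<Integer, String> store = new TreeMap<>();
        for (int i = 0; i < entries; i++) {
            store.put(i, "value-" + i);
        }
        return store;
    }

    // Iterates the whole store, refilling a cache of `cacheSize` entries per
    // seek, and returns how many seeks were needed.
    static int countSeeks(NavigableMap<Integer, String> store, int cacheSize) {
        if (cacheSize < 1) {
            throw new IllegalArgumentException("cacheSize must be >= 1");
        }
        int seeks = 0;
        Integer lastKey = null;      // last key handed out so far
        boolean exhausted = false;
        while (!exhausted) {
            seeks++;
            // "Seek": start at the beginning, or just after the last loaded key.
            NavigableMap<Integer, String> tail =
                    (lastKey == null) ? store : store.tailMap(lastKey, false);
            Iterator<Integer> it = tail.keySet().iterator();
            int loaded = 0;
            while (true) {
                if (!it.hasNext()) {  // range ended during this scan
                    exhausted = true;
                    break;
                }
                if (loaded == cacheSize) {  // cache full, another seek is needed
                    break;
                }
                lastKey = it.next();
                loaded++;
            }
        }
        return seeks;
    }

    public static void main(String[] args) {
        NavigableMap<Integer, String> store = buildStore(1000);
        System.out.println("cacheSize=1   -> " + countSeeks(store, 1) + " seeks");
        System.out.println("cacheSize=128 -> " + countSeeks(store, 128) + " seeks");
    }
}
```

In this model a map with 1000 entries needs 1000 seeks with a cache of 1 and 8 seeks with a cache of 128. Any map with more than one entry needs at least 2 seeks when the cache holds a single entry.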

This message was sent by Atlassian JIRA
