flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-8790) Improve performance for recovery from incremental checkpoint
Date Fri, 01 Jun 2018 09:56:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-8790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497800#comment-16497800 ]

ASF GitHub Bot commented on FLINK-8790:
---------------------------------------

Github user StefanRRichter commented on a diff in the pull request:

    https://github.com/apache/flink/pull/5582#discussion_r192350121
  
    --- Diff: flink-state-backends/flink-statebackend-rocksdb/src/test/java/org/apache/flink/contrib/streaming/state/RocksDBStateBackendTest.java ---
    @@ -547,4 +549,30 @@ public boolean accept(File file, String s) {
     			return true;
     		}
     	}
    +
    +	private static class TestRocksDBStateBackend extends RocksDBStateBackend {
    +
    +		public TestRocksDBStateBackend(AbstractStateBackend checkpointStreamBackend, boolean enableIncrementalCheckpointing) {
    +			super(checkpointStreamBackend, enableIncrementalCheckpointing);
    +		}
    +
    +		@Override
    +		public <K> AbstractKeyedStateBackend<K> createKeyedStateBackend(
    +			Environment env,
    +			JobID jobID,
    +			String operatorIdentifier,
    +			TypeSerializer<K> keySerializer,
    +			int numberOfKeyGroups,
    +			KeyGroupRange keyGroupRange,
    +			TaskKvStateRegistry kvStateRegistry) throws IOException {
    +
    +			AbstractKeyedStateBackend<K> keyedStateBackend = super.createKeyedStateBackend(
    +				env, jobID, operatorIdentifier, keySerializer, numberOfKeyGroups, keyGroupRange, kvStateRegistry);
    +
    +			// We ignore the range deletions on production, but when we are running the tests we shouldn't ignore it.
    --- End diff --
    
    That sounds good 👍 This also means we do not need any tricks with the read options :)


> Improve performance for recovery from incremental checkpoint
> ------------------------------------------------------------
>
>                 Key: FLINK-8790
>                 URL: https://issues.apache.org/jira/browse/FLINK-8790
>             Project: Flink
>          Issue Type: Improvement
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.5.0
>            Reporter: Sihua Zhou
>            Assignee: Sihua Zhou
>            Priority: Major
>             Fix For: 1.6.0
>
>
> When multiple state handles need to be restored, we can improve performance as follows:
> 1. Choose the best state handle to initialize the target db.
> 2. Use the other state handles to create temporary dbs, and clip each db to the target key group range (via rocksdb.deleteRange()). This helps us get rid of the `key group check` in the `data insertion loop` and also avoids traversing useless records.
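
The two restore steps described in the issue can be illustrated with a minimal, self-contained sketch. It is not the code from the pull request: the class `KeyGroupClipSketch`, its `Range` type, and the "largest overlap wins" rule for picking the best handle are all assumptions made for illustration, and a `TreeMap` keyed by key group stands in for a RocksDB instance, with the two `clear()` calls playing the role of the range deletions.

```java
import java.util.List;
import java.util.TreeMap;

// Hypothetical illustration only; none of these names come from Flink or RocksDB.
final class KeyGroupClipSketch {

    /** Inclusive key-group range [start, end]; a simplified stand-in for a key group range. */
    static final class Range {
        final int start;
        final int end;

        Range(int start, int end) {
            this.start = start;
            this.end = end;
        }

        /** Number of key groups shared with {@code other} (0 if the ranges are disjoint). */
        int overlap(Range other) {
            int lo = Math.max(start, other.start);
            int hi = Math.min(end, other.end);
            return Math.max(0, hi - lo + 1);
        }
    }

    /**
     * Step 1 (assumed criterion): return the index of the handle whose range
     * overlaps the target range the most, or -1 if no handle overlaps it.
     */
    static int chooseBestHandle(List<Range> handles, Range target) {
        int best = -1;
        int bestOverlap = 0;
        for (int i = 0; i < handles.size(); i++) {
            int o = handles.get(i).overlap(target);
            if (o > bestOverlap) {
                bestOverlap = o;
                best = i;
            }
        }
        return best;
    }

    /**
     * Step 2: clip a db to the target range with two range deletions, one for
     * everything below the range and one for everything above it. After this,
     * a copy loop no longer needs a per-record key group check.
     */
    static void clip(TreeMap<Integer, String> db, Range target) {
        db.headMap(target.start, false).clear(); // delete key groups < target.start
        db.tailMap(target.end, false).clear();   // delete key groups > target.end
    }
}
```

With handles covering key groups 0-3 and 4-9 and a target range of 3-8, this sketch picks the second handle as the initial db, and clipping a db that holds key groups 0-9 leaves only 3-8.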



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
