flink-user mailing list archives

From Ayush Goyal <ay...@helpshift.com>
Subject Storage options for RocksDBStateBackend
Date Thu, 11 May 2017 08:41:22 GMT
Hello,

I have a few questions regarding checkpoint storage options when using
RocksDBStateBackend. In the Flink 1.2 documentation, it is the recommended
state backend due to its ability to store large state and its asynchronous
snapshotting. For high availability, it seems HDFS is the recommended store
for state backend data. The AWS deployment section also mentions that S3 can
be used for storing state backend data.

We don't want to depend on a Hadoop cluster for our Flink deployment, so I
have the following questions:

1. Can we use any storage backend supported by Flink for storing
RocksDBStateBackend data with file URLs? There are quite a few supported, as
mentioned here:
https://ci.apache.org/projects/flink/flink-docs-release-1.3/internals/filesystems.html
and here:
https://github.com/apache/flink/blob/master/docs/dev/batch/connectors.md

2. Has any work already been done to support Windows Azure Blob Storage for
storing state backend data? There are some docs here:
https://github.com/apache/flink/blob/master/docs/dev/batch/connectors.md
Can we use that for this purpose?

3. If we use S3 for the state backend, is there any performance impact?

4. For high availability, can we use an NFS volume for the state backend,
with "file://" URLs? Will there be any performance impact?

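To make question 4 concrete, the setup we have in mind would look roughly
like the flink-conf.yaml fragment below. This is only a sketch: the path
/mnt/nfs/flink/checkpoints is a placeholder, and it assumes the same NFS
volume is mounted at that path on every JobManager and TaskManager host.

```
# Use RocksDB as the state backend
state.backend: rocksdb
# Where checkpoint data is written (placeholder path on the shared NFS mount)
state.backend.fs.checkpointdir: file:///mnt/nfs/flink/checkpoints
```
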
PS: I posted this email earlier via Nabble, but it is not showing up in the
Apache archive, so I am sending it again. Apologies if it results in multiple
threads.

-- Ayush
