flink-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From zentol <...@git.apache.org>
Subject [GitHub] flink pull request: [FLINK-3332] Cassandra connector
Date Wed, 10 Feb 2016 15:19:38 GMT
GitHub user zentol opened a pull request:


    [FLINK-3332] Cassandra connector

    This PR adds an Exactly-Once Cassandra Sink.
    The Exactly-once guarantee is made by saving incoming records in the OperatorState, and
only committing them into Cassandra when a checkpoint completes. Note that a job failure while
the data is being committed will cause duplicate data to be committed, but the chance of this
happening is much smaller than for a naive At-Least-once implementation.
    The CassandraExactlyOnceSink is implemented as a custom operator to get access to the
Statebackend. Values are committed with single inserts using a PreparedStatement that is supplied
by the user, similiar to the Batch JDBC-Outputformat.
    The Exactly-Once logic is completely contained in a GenericExactlyOnceSink class that
can be used by virtually every sink, requiring no knowledge about the checkpointing mechamism.
    The CassandraExactlyOnceSink and GenericExactlyOnceSink are covered by tests that use
the OneInputStreamTaskHarness to generate a task environment, verifying that stored data is
discarded when a state is restored; all data is being committed when a notify is missed; and
of course that everything works when nothing goes wrong.
    Note: This PR currently subsumes #1609 (the change to ResultPartitionWriter), so that
the tests run properly.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zentol/flink 3332_cassandra

Alternatively you can review and apply these changes as the patch at:


To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1620
commit 64f0b32c9292f1c5957badbcee30476b663eb5a1
Author: zentol <s.motsu@web.de>
Date:   2016-02-10T13:14:18Z

    [FLINK-3332] Cassandra connector


If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.

View raw message