Return-Path: X-Original-To: apmail-flink-issues-archive@minotaur.apache.org Delivered-To: apmail-flink-issues-archive@minotaur.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id E7CBF18F5B for ; Wed, 10 Feb 2016 15:19:39 +0000 (UTC) Received: (qmail 54263 invoked by uid 500); 10 Feb 2016 15:19:39 -0000 Delivered-To: apmail-flink-issues-archive@flink.apache.org Received: (qmail 54198 invoked by uid 500); 10 Feb 2016 15:19:39 -0000 Mailing-List: contact issues-help@flink.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@flink.apache.org Delivered-To: mailing list issues@flink.apache.org Received: (qmail 54189 invoked by uid 99); 10 Feb 2016 15:19:39 -0000 Received: from Unknown (HELO spamd2-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 10 Feb 2016 15:19:39 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd2-us-west.apache.org (ASF Mail Server at spamd2-us-west.apache.org) with ESMTP id 33C8E1A01D4 for ; Wed, 10 Feb 2016 15:19:39 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd2-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: -4.253 X-Spam-Level: X-Spam-Status: No, score=-4.253 tagged_above=-999 required=6.31 tests=[KAM_LAZY_DOMAIN_SECURITY=1, RCVD_IN_DNSWL_HI=-5, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, RP_MATCHES_RCVD=-0.233] autolearn=disabled Received: from mx1-us-west.apache.org ([10.40.0.8]) by localhost (spamd2-us-west.apache.org [10.40.0.9]) (amavisd-new, port 10024) with ESMTP id xsbDyw9bE-Zm for ; Wed, 10 Feb 2016 15:19:38 +0000 (UTC) Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by mx1-us-west.apache.org (ASF Mail Server at mx1-us-west.apache.org) with SMTP id 5B76F2050D for ; Wed, 10 Feb 2016 15:19:38 +0000 (UTC) Received: (qmail 54178 invoked by uid 99); 10 Feb 2016 15:19:38 -0000 Received: from git1-us-west.apache.org (HELO git1-us-west.apache.org) (140.211.11.23) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 10 Feb 2016 15:19:38 +0000 Received: by git1-us-west.apache.org (ASF Mail Server at git1-us-west.apache.org, from userid 33) id 32DFAE020E; Wed, 10 Feb 2016 15:19:38 +0000 (UTC) From: zentol To: issues@flink.incubator.apache.org Reply-To: issues@flink.incubator.apache.org Message-ID: Subject: [GitHub] flink pull request: [FLINK-3332] Cassandra connector Content-Type: text/plain Date: Wed, 10 Feb 2016 15:19:38 +0000 (UTC) GitHub user zentol opened a pull request: https://github.com/apache/flink/pull/1620 [FLINK-3332] Cassandra connector This PR adds an Exactly-Once Cassandra Sink. The Exactly-once guarantee is made by saving incoming records in the OperatorState, and only committing them into Cassandra when a checkpoint completes. Note that a job failure while the data is being committed will cause duplicate data to be committed, but the chance of this happening is much smaller than for a naive At-Least-once implementation. The CassandraExactlyOnceSink is implemented as a custom operator to get access to the Statebackend. Values are committed with single inserts using a PreparedStatement that is supplied by the user, similiar to the Batch JDBC-Outputformat. The Exactly-Once logic is completely contained in a GenericExactlyOnceSink class that can be used by virtually every sink, requiring no knowledge about the checkpointing mechamism. The CassandraExactlyOnceSink and GenericExactlyOnceSink are covered by tests that use the OneInputStreamTaskHarness to generate a task environment, verifying that stored data is discarded when a state is restored; all data is being committed when a notify is missed; and of course that everything works when nothing goes wrong. Note: This PR currently subsumes #1609 (the change to ResultPartitionWriter), so that the tests run properly. You can merge this pull request into a Git repository by running: $ git pull https://github.com/zentol/flink 3332_cassandra Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1620.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1620 ---- commit 64f0b32c9292f1c5957badbcee30476b663eb5a1 Author: zentol Date: 2016-02-10T13:14:18Z [FLINK-3332] Cassandra connector ---- --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastructure@apache.org or file a JIRA ticket with INFRA. ---