apex-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (APEXMALHAR-2086) Kafka Output Operator with Kafka 0.9 API
Date Fri, 08 Jul 2016 01:57:11 GMT

    [ https://issues.apache.org/jira/browse/APEXMALHAR-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15367091#comment-15367091 ]

ASF GitHub Bot commented on APEXMALHAR-2086:

GitHub user sandeshh reopened a pull request:


    [APEXMALHAR-2086] Kafka output operator: 0.9.0

    Kafka output exactly once operator and the regular output operator.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sandeshh/apex-malhar APEXMALHAR-2086

Alternatively you can review and apply these changes as the patch at:


To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #298

commit 525ce5cca6536c88052e5e8bcc430b6acda06d57
Author: sandeshh <sandesh.hegde@gmail.com>
Date:   2016-05-25T15:56:56Z

    Kafka 0.9.0 output operators and unit tests.
    1. Abstract Base class
    2. Kafka Output operator
    3. Exactly Once output operator
         Key in the Kafka message is used by the operator to track the tuples written by it.


> Kafka Output Operator with Kafka 0.9 API
> ----------------------------------------
>                 Key: APEXMALHAR-2086
>                 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2086
>             Project: Apache Apex Malhar
>          Issue Type: New Feature
>            Reporter: Sandesh
>            Assignee: Sandesh
> Goal: two operators for Kafka output
>       1. Simple Kafka Output Operator
>             - Supports at-least-once delivery
>             - Exposes the most-used producer properties as class properties
>       2. Exactly Once Kafka Output (not possible in all cases; will be documented later)
> Design for Exactly Once
> Window Data Manager - stores the Kafka partition offsets.
> Kafka Key - used by the operator = AppID#OperatorId
> During recovery, a partially written window is re-created using the following approach:
> Tuples between the largest recovery offsets and the current offset are checked. Based on the key, tuples written by other entities are discarded.
> Only tuples which are not in the recovered set are emitted.
> Tuples need to be unique within the window.
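
The recovery approach described above can be sketched roughly as follows. This is a minimal illustration, not the code in the pull request: the class ExactlyOnceRecoverySketch, the Record stand-in, and the method names are all hypothetical, a single partition is assumed, and the checked offset range is assumed to be half-open (recovered offset inclusive, current offset exclusive). No Kafka client calls are used; the topic contents are modelled as a plain list.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the exactly-once recovery filter; not the operator's actual code.
final class ExactlyOnceRecoverySketch {

    /** Minimal stand-in for a Kafka record: offset, message key, and payload. */
    static final class Record {
        final long offset;
        final String key;
        final String value;

        Record(long offset, String key, String value) {
            this.offset = offset;
            this.key = key;
            this.value = value;
        }
    }

    /** Builds the message key in the AppID#OperatorId form given in the issue. */
    static String operatorKey(String appId, int operatorId) {
        return appId + "#" + operatorId;
    }

    /**
     * Collects the payloads this operator already wrote in the partially
     * written window. Records outside [recoveredOffset, currentOffset) are
     * skipped (assumed range), and records whose key belongs to another
     * writer are discarded.
     */
    static Set<String> recoverWritten(List<Record> log, long recoveredOffset,
                                      long currentOffset, String ownKey) {
        Set<String> written = new HashSet<>();
        for (Record r : log) {
            boolean inRange = r.offset >= recoveredOffset && r.offset < currentOffset;
            if (inRange && ownKey.equals(r.key)) {
                written.add(r.value);
            }
        }
        return written;
    }

    /** Returns only the replayed tuples that are not in the recovered set. */
    static List<String> tuplesToEmit(List<String> replayedWindow, Set<String> alreadyWritten) {
        List<String> out = new ArrayList<>();
        for (String tuple : replayedWindow) {
            if (!alreadyWritten.contains(tuple)) {
                out.add(tuple);
            }
        }
        return out;
    }
}
```

Because the recovered tuples are held in a set, two equal tuples in the same window cannot be told apart, which is one way to read the requirement that tuples be unique within the window.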

This message was sent by Atlassian JIRA
