spark-reviews mailing list archives

From zsxwing <...@git.apache.org>
Subject [GitHub] spark pull request #22207: [SPARK-25214][SS]Fix the issue that Kafka v2 sour...
Date Thu, 23 Aug 2018 18:17:19 GMT
GitHub user zsxwing opened a pull request:

    https://github.com/apache/spark/pull/22207

    [SPARK-25214][SS]Fix the issue that Kafka v2 source may return duplicated records when `failOnDataLoss=false`

    ## What changes were proposed in this pull request?
    
    When offsets are missing and `failOnDataLoss=false`, the Kafka v2 source may return duplicated records.
    
    This PR fixes the issue and also adds regression tests for all Kafka readers.
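
A toy model may help show how this kind of duplication can arise. The sketch below is not the Spark code and not necessarily the mechanism fixed in this PR. It assumes a lenient consumer that, when asked for a missing offset, returns the next available record instead of failing. The names `fetch`, `read_naive`, and `read_tracking` are hypothetical and exist only for illustration.

```python
from bisect import bisect_left


def fetch(available, offset, until):
    """Toy stand-in for a lenient consumer: return the first available
    offset in [offset, until), or None if nothing is left in the range.
    `available` must be a sorted list of offsets that still exist."""
    i = bisect_left(available, offset)
    if i < len(available) and available[i] < until:
        return available[i]
    return None


def read_naive(available, start, until):
    """Advance the cursor by one per record, ignoring any skipped gap."""
    out = []
    nxt = start
    while nxt < until:
        rec = fetch(available, nxt, until)
        if rec is None:
            break
        out.append(rec)
        nxt += 1  # assumption: the cursor moves by one even if rec > nxt
    return out


def read_tracking(available, start, until):
    """Advance the cursor past the offset that was actually returned."""
    out = []
    nxt = start
    while nxt < until:
        rec = fetch(available, nxt, until)
        if rec is None:
            break
        out.append(rec)
        nxt = rec + 1  # resume after the record that was really read
    return out


# Offsets 2 and 3 are missing from the log.
log = [0, 1, 4, 5]
print(read_naive(log, 0, 6))     # [0, 1, 4, 4, 4, 5]
print(read_tracking(log, 0, 6))  # [0, 1, 4, 5]
```

In this model the naive reader re-requests offsets 3 and 4 after already receiving record 4, so that record comes back three times. Tracking the offset of the record actually returned avoids the repeats.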
    
    ## How was this patch tested?
    
    New tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zsxwing/spark SPARK-25214

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22207.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22207
    
----
commit f2d4d67c765a298d23964b26ec07596839f008fa
Author: Shixiong Zhu <zsxwing@...>
Date:   2018-08-23T17:46:52Z

    Fix the issue that Kafka v2 source may return duplicated records when `failOnDataLoss` is false

----



