beam-commits mailing list archives

From "Thomas Groh (JIRA)" <j...@apache.org>
Subject [jira] [Created] (BEAM-1725) SparkRunner should deduplicate when an UnboundedSource requires Deduping
Date Wed, 15 Mar 2017 00:53:41 GMT
Thomas Groh created BEAM-1725:
---------------------------------

             Summary: SparkRunner should deduplicate when an UnboundedSource requires Deduping
                 Key: BEAM-1725
                 URL: https://issues.apache.org/jira/browse/BEAM-1725
             Project: Beam
          Issue Type: Bug
          Components: runner-spark
            Reporter: Thomas Groh


The SparkRunner's implementation of an unbounded Read does not inspect the requiresDeduping
property of the source, and as a result does not deduplicate records from sources that require it.

https://github.com/apache/beam/blob/master/runners/spark/src/main/java/org/apache/beam/runners/spark/io/SparkUnboundedSource.java
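
A minimal sketch of the requested behavior, in plain Java rather than against the Beam or Spark APIs. The names DedupSketch, Record, Source, fixedSource and readValues are hypothetical and exist only for illustration; the one idea borrowed from the issue is that the reader consults requiresDeduping and, when it is true, drops records whose ID has already been seen. String IDs are an assumption made for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DedupSketch {

    /** Hypothetical record: a payload plus the ID the source assigns to it. */
    static final class Record {
        final String id;
        final String value;

        Record(String id, String value) {
            this.id = id;
            this.value = value;
        }
    }

    /** Hypothetical stand-in for an unbounded source; this is not the Beam class. */
    interface Source {
        boolean requiresDeduping();
        List<Record> read();
    }

    /** Builds a fixed, in-memory source for illustration. */
    static Source fixedSource(final boolean requiresDeduping, final Record... records) {
        return new Source() {
            @Override public boolean requiresDeduping() { return requiresDeduping; }
            @Override public List<Record> read() { return Arrays.asList(records); }
        };
    }

    /** Reads all values, dropping repeated record IDs only if the source asks for it. */
    static List<String> readValues(Source source) {
        // Track seen IDs only when the source declares that it may emit duplicates.
        Set<String> seenIds = source.requiresDeduping() ? new HashSet<String>() : null;
        List<String> values = new ArrayList<>();
        for (Record record : source.read()) {
            // Set.add returns false when the ID was already present.
            if (seenIds != null && !seenIds.add(record.id)) {
                continue;
            }
            values.add(record.value);
        }
        return values;
    }

    public static void main(String[] args) {
        Record[] records = {
            new Record("1", "a"), new Record("2", "b"), new Record("1", "a")
        };
        System.out.println(readValues(fixedSource(true, records)));  // prints [a, b]
        System.out.println(readValues(fixedSource(false, records))); // prints [a, b, a]
    }
}
```

The sketch keeps the set of seen IDs in memory for a single read. A streaming runner would presumably need to keep that state across micro-batches and expire it eventually, which is outside the scope of this illustration.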



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
