beam-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Amit Sela (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (BEAM-549) SparkRunner should support Beam's KafkaIO instead of providing it's own.
Date Fri, 12 Aug 2016 21:49:20 GMT

     [ https://issues.apache.org/jira/browse/BEAM-549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Amit Sela updated BEAM-549:
---------------------------
    Description: 
For portability, and in the spirit of Apache Beam, the SparkRunner should use the Beam implementation
of KafkaIO instead of it's own.

Having said that, the runner will translate the KafkaIO as defined in the pipeline into it's
own internal implementation, but should still map the properties the user defined in the pipeline
in a way that the IO behaves the same - i.e., brokers, topic, etc.

Eventually, the SparkRunner will implement reading from Kafka using Spark's KafakUtils.createDirectStream()
as described here: http://spark.apache.org/docs/1.6.2/streaming-kafka-integration.html#approach-2-direct-approach-no-receivers


  was:
For portability, and in the spirit of Apache Beam, the SparkRunner should use the Beam implementation
of KafkaIO instead of it's own.

Having said that, it might choose to translate the KafkaIO as defined in the pipeline into
it's own internal implementation, but should still map the properties the user defined in
the pipeline in a way that the IO behaves the same - i.e., brokers, topic, etc.


> SparkRunner should support Beam's KafkaIO instead of providing it's own.
> ------------------------------------------------------------------------
>
>                 Key: BEAM-549
>                 URL: https://issues.apache.org/jira/browse/BEAM-549
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-spark
>            Reporter: Amit Sela
>            Assignee: Amit Sela
>
> For portability, and in the spirit of Apache Beam, the SparkRunner should use the Beam
implementation of KafkaIO instead of it's own.
> Having said that, the runner will translate the KafkaIO as defined in the pipeline into
it's own internal implementation, but should still map the properties the user defined in
the pipeline in a way that the IO behaves the same - i.e., brokers, topic, etc.
> Eventually, the SparkRunner will implement reading from Kafka using Spark's KafakUtils.createDirectStream()
as described here: http://spark.apache.org/docs/1.6.2/streaming-kafka-integration.html#approach-2-direct-approach-no-receivers




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message