beam-commits mailing list archives

From "Amit Sela (JIRA)" <>
Subject [jira] [Created] (BEAM-17) Add support for new Dataflow Source API
Date Mon, 15 Feb 2016 18:17:18 GMT
Amit Sela created BEAM-17:

             Summary: Add support for new Dataflow Source API
                 Key: BEAM-17
             Project: Beam
          Issue Type: Improvement
          Components: runner-spark
            Reporter: Amit Sela
            Assignee: Amit Sela

The API is discussed in

To implement this, we need to add support for in TransformTranslator.
This can be done by creating a new SourceInputFormat class that translates from a DF Source
to a Hadoop InputFormat. The two concepts are pretty well aligned, since both are built
around splits and readers.
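
As a rough illustration of why splits and readers line up, here is a minimal, self-contained sketch of the adapter idea. It does not use the real Dataflow or Hadoop classes: SimpleSource, SourceSplit, RangeSource and readAll are hypothetical stand-ins invented for this example, and the SourceInputFormat shown here only mirrors the shape of a Hadoop InputFormat (getSplits plus a per-split record reader) rather than extending it.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-ins only: none of these types are the real Dataflow or Hadoop APIs.
final class SourceInputFormatSketch {

    /** Stand-in for a bounded source: it can split itself and create a reader. */
    interface SimpleSource<T> {
        List<SimpleSource<T>> split(int desiredNumSplits);
        Iterator<T> createReader();
    }

    /** Stand-in for an InputSplit: it simply carries one sub-source. */
    static final class SourceSplit<T> {
        final SimpleSource<T> source;

        SourceSplit(SimpleSource<T> source) {
            this.source = source;
        }
    }

    /** Stand-in for an InputFormat that delegates splitting and reading to a source. */
    static final class SourceInputFormat<T> {
        private final SimpleSource<T> source;
        private final int desiredNumSplits;

        SourceInputFormat(SimpleSource<T> source, int desiredNumSplits) {
            this.source = source;
            this.desiredNumSplits = desiredNumSplits;
        }

        /** Each sub-source produced by the source becomes one split. */
        List<SourceSplit<T>> getSplits() {
            List<SourceSplit<T>> splits = new ArrayList<>();
            for (SimpleSource<T> sub : source.split(desiredNumSplits)) {
                splits.add(new SourceSplit<>(sub));
            }
            return splits;
        }

        /** The record reader for a split is the reader of the sub-source it carries. */
        Iterator<T> createRecordReader(SourceSplit<T> split) {
            return split.source.createReader();
        }
    }

    /** Toy source over the half-open integer range [start, end). */
    static final class RangeSource implements SimpleSource<Integer> {
        final int start;
        final int end;

        RangeSource(int start, int end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public List<SimpleSource<Integer>> split(int desiredNumSplits) {
            List<SimpleSource<Integer>> parts = new ArrayList<>();
            int n = Math.max(1, desiredNumSplits);
            // Ceiling division so that at most n chunks cover the whole range.
            int chunk = Math.max(1, (end - start + n - 1) / n);
            for (int s = start; s < end; s += chunk) {
                parts.add(new RangeSource(s, Math.min(end, s + chunk)));
            }
            return parts;
        }

        @Override
        public Iterator<Integer> createReader() {
            return new Iterator<Integer>() {
                private int current = start;

                @Override
                public boolean hasNext() {
                    return current < end;
                }

                @Override
                public Integer next() {
                    if (!hasNext()) {
                        throw new NoSuchElementException();
                    }
                    return current++;
                }
            };
        }
    }

    /** Reads every split in order, the way a single-threaded driver might. */
    static <T> List<T> readAll(SourceInputFormat<T> format) {
        List<T> out = new ArrayList<>();
        for (SourceSplit<T> split : format.getSplits()) {
            Iterator<T> reader = format.createRecordReader(split);
            while (reader.hasNext()) {
                out.add(reader.next());
            }
        }
        return out;
    }
}
```

In a real implementation the split would also have to be serializable so it can be shipped to workers, and the reader would follow the Hadoop RecordReader lifecycle; both are left out of this sketch.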

Note that when there's a native HadoopSource in DF, it will need special-casing in the code
for Read since we'll be able to use the underlying InputFormat directly.

This could be tested using XmlSource from the SDK.

