beam-commits mailing list archives

From "Aviem Zur (JIRA)" <>
Subject [jira] [Commented] (BEAM-1581) JSON source and sink
Date Fri, 07 Apr 2017 04:47:41 GMT


Aviem Zur commented on BEAM-1581:

We probably won't cover the entire space in one commit. As you said, we'll start with what
users currently need, and the community will add support for further use cases as we go
along.
What we should do is provide a good foundation for this in the initial commit: a sink/source
API to which other use cases can be added in the future.

Let's keep in mind that this is a Beam extension and not part of its core.

There is still the question of naming. We could solve this with Jackson, call it JsonIO, and
have users deal with the fact that it uses Jackson. Or we could call it JacksonIO, which makes
it clear that a user who wishes to use another framework can add a new implementation
and call it xxxIO. I'm for the latter: calling this JacksonIO explicitly.

> JSON source and sink
> --------------------
>                 Key: BEAM-1581
>                 URL:
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-java-extensions
>            Reporter: Aviem Zur
>            Assignee: Aviem Zur
> JSON source and sink to read/write JSON files.
> Similarly to {{XmlSource}}/{{XmlSink}}, these would be a {{JsonSource}}/{{JsonSink}} which are a {{FileBasedSource}}/{{FileBasedSink}}.
> Consider using methods/code (or refactor these) found in {{AsJsons}} and {{ParseJsons}}.
> The {{PCollection}} of objects the user passes to the transform should be embedded in a valid JSON file.
> The most common pattern for this is a large object with an array member which holds all the data objects and other members for metadata.
> Examples of public JSON APIs:
> Another pattern used is a file which is simply a JSON array of objects.
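
As a rough illustration of the two file layouts described in the issue, here is a minimal sketch in plain Java. It is not the proposed Beam API: the class `JsonFileLayouts`, its method names, and the `items` and `count` member names are all hypothetical. It assumes each element has already been serialized to a JSON string (for example by Jackson) and that member names need no escaping.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper for illustration only; not part of Beam or Jackson.
final class JsonFileLayouts {

    // Layout: the file is simply a JSON array of objects.
    // Each element is assumed to be an already-serialized JSON value.
    static String asBareArray(List<String> elements) {
        StringJoiner array = new StringJoiner(",", "[", "]");
        for (String element : elements) {
            array.add(element);
        }
        return array.toString();
    }

    // Layout: one large object with metadata members plus an array member
    // that holds all the data objects.
    // Assumptions: member names need no JSON escaping, and metadata values
    // are already-serialized JSON values.
    static String asWrappedObject(String arrayMember,
                                  List<String> elements,
                                  Map<String, String> metadata) {
        StringJoiner members = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, String> entry : metadata.entrySet()) {
            members.add("\"" + entry.getKey() + "\":" + entry.getValue());
        }
        members.add("\"" + arrayMember + "\":" + asBareArray(elements));
        return members.toString();
    }

    public static void main(String[] args) {
        List<String> elements = Arrays.asList("{\"id\":1}", "{\"id\":2}");
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("count", "2");

        // prints [{"id":1},{"id":2}]
        System.out.println(asBareArray(elements));
        // prints {"count":2,"items":[{"id":1},{"id":2}]}
        System.out.println(asWrappedObject("items", elements, metadata));
    }
}
```

A real sink would write the opening part, the elements, and the closing part incrementally instead of building one string in memory; the sketch only shows what the resulting file content looks like in each pattern.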
