Mailing-List: contact commits-help@beam.apache.org; run by ezmlm
Reply-To: dev@beam.apache.org
Date: Wed, 15 Mar 2017 18:32:41 +0000 (UTC)
From: "Aviem Zur (JIRA)"
To: commits@beam.incubator.apache.org
Subject: [jira] [Comment Edited] (BEAM-1581) JSON sources and sinks

    [ https://issues.apache.org/jira/browse/BEAM-1581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15926687#comment-15926687 ]

Aviem Zur edited comment on BEAM-1581 at 3/15/17 6:32 PM:
----------------------------------------------------------

I think we should avoid exposing a contract to the user that promises to write JSON but accepts strings. This is a loose contract that leaves JSON validity up to the user. If the user does not create valid JSON strings, errors can occur, and they might be detected very late in the process, possibly only when another process tries to consume the data (a process that may belong to a different user, since JSON is often used for integration).

We definitely need concrete {{JsonSink extends FileBasedSink}} and {{JsonSource extends FileBasedSource}} classes, but these should not be used directly by the user. All common JSON file logic regarding how the file should be constructed (as [~jkff] mentioned, this should be better defined) will be in this sink and source, including all file writing/reading code (inherited from {{FileBasedSink}} and {{FileBasedSource}}).

In order to avoid exposing classes that deal with strings to the user, we need concrete {{PTransform}} classes that deal with objects. The problem is that these probably can't exist in a {{JsonIO}} class, since it cannot contain the transformations from objects to JSON strings (there are several ways to implement this). Should these transforms be in a separate class such as {{JacksonIO}} (similar to {{AvroIO}})?


was (Author: aviemzur):
I think we should avoid exposing a contract to the user that promises to write JSON but accepts strings. This is a loose contract that leaves JSON validity up to the user. If the user does not create valid JSON strings, errors can occur, and they might be detected very late in the process, possibly only when another process tries to consume the data (a process that may belong to a different user, since JSON is often used for integration).

We definitely need concrete {{JsonSink extends FileBasedSink}} and {{JsonSource extends FileBasedSource}} classes, but these should not be used directly by the user. All common JSON file logic regarding how the file should be constructed (as [~jkff] mentioned, this should be better defined) will be in this sink and source, including all file writing/reading code (inherited from {{FileBasedSink}} and {{FileBasedSource}}).

In order to avoid exposing classes that deal with strings to the user, we need concrete {{PTransform}} classes that deal with objects. The problem is that these probably can't exist in a {{JsonIO}} class, since it cannot contain the transformations from objects to JSON strings (there are several ways to implement this). Should these transforms be in a separate class such as {{JacksonIO}}?

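To make the contract argument in the comment above concrete, here is a minimal sketch in plain Java. It does not use Beam or Jackson, and every name in it ({{JsonContractSketch}}, {{RawJsonLineSink}}, {{TypedJsonWriter}}, {{User}}, {{userToJson}}) is hypothetical and chosen only for illustration. It assumes the string-level sink can be hidden from the user, and that the object-to-JSON step is supplied from outside, since there are several ways to implement it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Illustration only: hypothetical names, no Beam or Jackson dependency. */
final class JsonContractSketch {

    /** Stand-in for a string-level sink; private so users cannot hand it raw strings. */
    private static final class RawJsonLineSink {
        private final List<String> lines = new ArrayList<>();

        void write(String json) {
            lines.add(json);
        }

        List<String> lines() {
            return lines;
        }
    }

    /** User-facing writer: accepts objects, never strings. */
    static final class TypedJsonWriter<T> {
        private final Function<T, String> toJson; // pluggable serializer (assumption)
        private final RawJsonLineSink sink = new RawJsonLineSink();

        TypedJsonWriter(Function<T, String> toJson) {
            this.toJson = toJson;
        }

        void write(T element) {
            // Serialization happens here, so JSON validity is the serializer's job.
            sink.write(toJson.apply(element));
        }

        List<String> output() {
            return new ArrayList<>(sink.lines()); // defensive copy
        }
    }

    /** Example element type. */
    static final class User {
        final String name;
        final int age;

        User(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    /**
     * Hand-written serializer standing in for a real JSON library.
     * It escapes only backslashes and double quotes, not control characters.
     */
    static String userToJson(User u) {
        String escaped = u.name.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"name\":\"" + escaped + "\",\"age\":" + u.age + "}";
    }

    public static void main(String[] args) {
        TypedJsonWriter<User> writer = new TypedJsonWriter<>(JsonContractSketch::userToJson);
        writer.write(new User("Ada \"Countess\" Lovelace", 36));
        writer.output().forEach(System.out::println);
    }
}
```

In this sketch the user never produces a JSON string by hand, so a malformed document can only come from the serializer itself. A library-specific class (the {{JacksonIO}} idea) would play the role of the serializer argument here.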
> JSON sources and sinks
> ----------------------
>
>                 Key: BEAM-1581
>                 URL: https://issues.apache.org/jira/browse/BEAM-1581
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-java-extensions
>            Reporter: Aviem Zur
>            Assignee: Aviem Zur
>
> JSON source and sink to read/write JSON files.
> Similarly to {{XmlSource}}/{{XmlSink}}, these would be a {{JsonSource}}/{{JsonSink}} which are a {{FileBasedSource}}/{{FileBasedSink}}.
> Consider using methods/code (or refactor these) found in {{AsJsons}} and {{ParseJsons}}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)