Date: Mon, 6 Nov 2017 16:29:00 +0000 (UTC)
From: "Etienne Chauchot (JIRA)"
To: commits@beam.apache.org
Reply-To: dev@beam.apache.org
Subject: [jira] [Commented] (BEAM-2993) AvroIO.write without specifying a schema

    [ https://issues.apache.org/jira/browse/BEAM-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16240520#comment-16240520 ]

Etienne Chauchot commented on BEAM-2993:
----------------------------------------

Because the PCollection is not ordered, one bundle may end up holding only SCHEMA1 records and another only SCHEMA2 records. Guessing the schema lazily from the "first" element would then write both bundles with no error, because it would guess SCHEMA1 from bundle 1 and SCHEMA2 from bundle 2. The result would be an Avro file that has 2 schemas, which is wrong.

> AvroIO.write without specifying a schema
> ----------------------------------------
>
>                 Key: BEAM-2993
>                 URL: https://issues.apache.org/jira/browse/BEAM-2993
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-java-extensions
>            Reporter: Etienne Chauchot
>            Assignee: Etienne Chauchot
>
> Similarly to https://issues.apache.org/jira/browse/BEAM-2677, we should be able to write to Avro files using {{AvroIO}} without specifying a schema at build time.
> Consider the following use case: a user has a {{PCollection<GenericRecord>}}, but the schema is only known while running the pipeline. {{AvroIO.writeGenericRecords}} needs the schema, but the schema is already available in {{GenericRecord}}. We should be able to call {{AvroIO.writeGenericRecords()}} with no schema.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
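
The bundle scenario described in the comment can be sketched in plain Java. This is only an illustration under stated assumptions: it does not use Beam or Avro, and the names {{Rec}}, {{guessPerBundle}} and {{LazySchemaGuess}} are hypothetical stand-ins, with {{Rec}} playing the role of a {{GenericRecord}} that carries its own schema. Each bundle guesses its schema from its first element and validates only its own elements, so no bundle ever sees a mismatch.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class LazySchemaGuess {

    // Hypothetical stand-in for a GenericRecord: it only carries a schema name.
    static final class Rec {
        final String schema;
        final String payload;

        Rec(String schema, String payload) {
            this.schema = schema;
            this.payload = payload;
        }
    }

    // Lazy strategy: every bundle takes the schema of its first element and
    // checks the rest of that bundle against it. Bundles never compare notes.
    static Set<String> guessPerBundle(List<List<Rec>> bundles) {
        Set<String> schemasUsed = new LinkedHashSet<>();
        for (List<Rec> bundle : bundles) {
            if (bundle.isEmpty()) {
                continue;
            }
            String guessed = bundle.get(0).schema;
            for (Rec r : bundle) {
                if (!r.schema.equals(guessed)) {
                    throw new IllegalStateException("schema mismatch inside a bundle");
                }
            }
            schemasUsed.add(guessed);
        }
        return schemasUsed;
    }

    public static void main(String[] args) {
        // The unordered collection happens to be split so that each bundle
        // is homogeneous, as in the comment above.
        List<List<Rec>> bundles = Arrays.asList(
                Arrays.asList(new Rec("SCHEMA1", "a"), new Rec("SCHEMA1", "b")),
                Arrays.asList(new Rec("SCHEMA2", "c")));

        // No exception is thrown, yet two different schemas were used.
        System.out.println("schemas written without error: " + guessPerBundle(bundles));
    }
}
```

In this sketch the mismatch would only surface if records of both schemas landed in the same bundle, which the runner does not guarantee.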