Date: Thu, 29 Sep 2016 01:51:20 +0000 (UTC)
From: "Jeffrey Payne (JIRA)"
To: commits@beam.incubator.apache.org
Reply-To: dev@beam.incubator.apache.org
Subject: [jira] [Commented] (BEAM-55) Allow users to compress FileBasedSink output files

    [ https://issues.apache.org/jira/browse/BEAM-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15531516#comment-15531516 ]

Jeffrey Payne commented on BEAM-55:
-----------------------------------

We too prefer to use binary file formats like Avro or Parquet, for many reasons, including automatic compression handling. Unfortunately, we have several existing SLAs with clients that necessitate compressed CSV output; some even require a *single compressed CSV file*, ugh. What they do with the file once it's out of our hands is their problem :)

I'll read through the contribution guide, fork Beam, and submit a PR. Thanks again for the direction!

> Allow users to compress FileBasedSink output files
> --------------------------------------------------
>
>                 Key: BEAM-55
>                 URL: https://issues.apache.org/jira/browse/BEAM-55
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-java-core
>            Reporter: Daniel Halperin
>            Priority: Minor
>
> FileBasedSink (also TextIO.Write, AvroIO.Write, etc.) does not have an option for compressing its output.
> In general, we discourage compression because it limits or blocks scalably reading from a file in parallel. However, users may want it -- so we should support the option (with appropriate warnings).

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
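
For readers of the archive, here is a rough sketch of the trade-off the issue describes. It is not Beam code and does not use any Beam API. It only uses Python's standard `gzip` and `csv` modules to write a single compressed CSV file, then shows why such a file is hard to read in parallel: a reader that starts in the middle of the gzip stream cannot decode it. The helper names (`write_gzip_csv`, `read_gzip_csv`, `can_decode_from`) are made up for illustration.

```python
import csv
import gzip
import os
import tempfile
import zlib


def write_gzip_csv(rows, path):
    """Write all rows to one gzip-compressed CSV file (hypothetical helper)."""
    with gzip.open(path, "wt", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)


def read_gzip_csv(path):
    """Read the whole file back sequentially, from the first byte."""
    with gzip.open(path, "rt", newline="", encoding="utf-8") as f:
        return list(csv.reader(f))


def can_decode_from(path, offset):
    """Try to decompress starting at a byte offset, as a parallel reader
    working on a file split would have to. Returns False if decoding fails."""
    with open(path, "rb") as f:
        f.seek(offset)
        chunk = f.read()
    try:
        gzip.decompress(chunk)
        return True
    except (OSError, EOFError, zlib.error):
        # No gzip header at this offset, or the stream is unreadable from here.
        return False


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "out.csv.gz")
    write_gzip_csv([["id", "name"], ["1", "a"], ["2", "b"]], path)
    print(read_gzip_csv(path))
    print("from start:", can_decode_from(path, 0))
    print("from middle:", can_decode_from(path, os.path.getsize(path) // 2))
```

A plain, uncompressed CSV can be split at newline boundaries and handed to several workers; the gzip file above has to be read by one worker from the beginning, which is the scalability concern raised in the issue description.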