Date: Wed, 29 Mar 2017 19:05:41 +0000 (UTC)
From: "Chamikara Jayalath (JIRA)"
To: commits@beam.apache.org
Subject: [jira] [Commented] (BEAM-778) Make fileio._CompressedFile seekable.

    [ https://issues.apache.org/jira/browse/BEAM-778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15947718#comment-15947718 ]

Chamikara Jayalath commented on BEAM-778:
-----------------------------------------

Currently this is not an issue, since Beam's FileBasedSource and FileBasedSink are the only users of CompressedFile/File objects, and they use them in a straightforward way: each FileBasedSource/FileBasedSink object owns its File/CompressedFile object, and reading is done on a single thread. A secondary thread that performs dynamic work rebalancing might execute seek() operations on File objects, but not on CompressedFile objects.

In the future we might have other places where CompressedFile objects are accessed from multiple threads, but I think we should wait until such needs arise. It might also be enough to declare CompressedFile objects not thread-safe and expect users to handle thread safety themselves, instead of embedding a lock in CompressedFile objects, which could add a performance penalty for all users.

WDYT?
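
The caller-side locking option mentioned above could look roughly like the sketch below. This is only an illustration, not Beam code: LockedReader and read_at are hypothetical names, and io.BytesIO stands in for a File/CompressedFile object. The point is that the seek() and read() pair is made atomic by the user, so single-threaded readers pay no locking cost.

```python
import io
import threading


class LockedReader(object):
    """Hypothetical caller-side wrapper; not part of Beam."""

    def __init__(self, fileobj):
        self._file = fileobj
        self._lock = threading.Lock()

    def read_at(self, offset, size):
        # seek() and read() must run as one atomic step; otherwise another
        # thread could move the file position between the two calls.
        with self._lock:
            self._file.seek(offset)
            return self._file.read(size)


# io.BytesIO is a stand-in for a File/CompressedFile object (assumption).
reader = LockedReader(io.BytesIO(b'0123456789' * 100))
results = {}


def worker(i):
    # Each thread reads its own 10-byte slice through the shared wrapper.
    results[i] = reader.read_at(i * 10, 10)


threads = [threading.Thread(target=worker, args=(i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Only code that actually shares a file object across threads would need such a wrapper; the file object itself stays lock-free.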
> Make fileio._CompressedFile seekable.
> -------------------------------------
>
>                 Key: BEAM-778
>                 URL: https://issues.apache.org/jira/browse/BEAM-778
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-py
>            Reporter: Chamikara Jayalath
>            Assignee: Tibor Kiss
>             Fix For: Not applicable
>
>
> We have a TODO to make fileio._CompressedFile seekable.
> https://github.com/apache/incubator-beam/blob/python-sdk/sdks/python/apache_beam/io/fileio.py#L692
> Without this, compressed file objects produced for FileBasedSource implementations may not be able to use libraries that rely on the seek() and tell() methods.
> For example, tarfile.open().

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
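
To illustrate the tarfile.open() limitation described in the issue, here is a minimal sketch using only the Python standard library. ForwardOnly is a hypothetical stand-in for a file object without seek()/tell(); it is not the Beam class. Opening it in tarfile's random-access mode ('r:') fails, while the stream mode ('r|'), which only reads forward, still works.

```python
import io
import tarfile

# Build a small uncompressed tar archive in memory.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode='w') as tar:
    payload = b'hello'
    info = tarfile.TarInfo(name='a.txt')
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))
raw = buf.getvalue()


class ForwardOnly(io.RawIOBase):
    """Hypothetical read-only stream without seek()/tell() support."""

    def __init__(self, data):
        super(ForwardOnly, self).__init__()
        self._inner = io.BytesIO(data)

    def readable(self):
        return True

    def readinto(self, b):
        return self._inner.readinto(b)


# Random-access mode calls tell()/seek() on the file object, so it fails.
random_access_failed = False
try:
    tarfile.open(fileobj=ForwardOnly(raw), mode='r:')
except (OSError, ValueError, tarfile.TarError):
    random_access_failed = True

# Stream mode reads strictly forward and needs no seek()/tell().
streamed = {}
with tarfile.open(fileobj=ForwardOnly(raw), mode='r|') as tar:
    for member in tar:
        streamed[member.name] = tar.extractfile(member).read()
```

Stream mode only helps libraries that offer a forward-only path; anything that needs random access still requires a seekable file object.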