Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 3FB55200C4E for ; Fri, 21 Apr 2017 23:54:10 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id 3E471160B97; Fri, 21 Apr 2017 21:54:10 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 871B3160B86 for ; Fri, 21 Apr 2017 23:54:09 +0200 (CEST) Received: (qmail 8876 invoked by uid 500); 21 Apr 2017 21:54:08 -0000 Mailing-List: contact common-issues-help@hadoop.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Delivered-To: mailing list common-issues@hadoop.apache.org Received: (qmail 8865 invoked by uid 99); 21 Apr 2017 21:54:08 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd4-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 21 Apr 2017 21:54:08 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd4-us-west.apache.org (ASF Mail Server at spamd4-us-west.apache.org) with ESMTP id 286B7C0370 for ; Fri, 21 Apr 2017 21:54:08 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd4-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: -99.202 X-Spam-Level: X-Spam-Status: No, score=-99.202 tagged_above=-999 required=6.31 tests=[KAM_ASCII_DIVIDERS=0.8, RP_MATCHES_RCVD=-0.001, SPF_PASS=-0.001, USER_IN_WHITELIST=-100] autolearn=disabled Received: from mx1-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd4-us-west.apache.org [10.40.0.11]) (amavisd-new, port 10024) with ESMTP id tVhnDKLcQxaY for ; Fri, 21 Apr 2017 21:54:07 +0000 (UTC) Received: from mailrelay1-us-west.apache.org (mailrelay1-us-west.apache.org [209.188.14.139]) by mx1-lw-eu.apache.org (ASF Mail Server at mx1-lw-eu.apache.org) with ESMTP id 163AC60D0A for ; Fri, 21 Apr 2017 21:54:06 +0000 (UTC) Received: from jira-lw-us.apache.org (unknown [207.244.88.139]) by mailrelay1-us-west.apache.org (ASF Mail Server at mailrelay1-us-west.apache.org) with ESMTP id 158FDE0D77 for ; Fri, 21 Apr 2017 21:54:05 +0000 (UTC) Received: from jira-lw-us.apache.org (localhost [127.0.0.1]) by jira-lw-us.apache.org (ASF Mail Server at jira-lw-us.apache.org) with ESMTP id 7978E21B5E for ; Fri, 21 Apr 2017 21:54:04 +0000 (UTC) Date: Fri, 21 Apr 2017 21:54:04 +0000 (UTC) From: "Andrew Wang (JIRA)" To: common-issues@hadoop.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Updated] (HADOOP-14028) S3A BlockOutputStreams doesn't delete temporary files in multipart uploads or handle part upload failures MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 archived-at: Fri, 21 Apr 2017 21:54:10 -0000 [ https://issues.apache.org/jira/browse/HADOOP-14028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HADOOP-14028: --------------------------------- Fix Version/s: 3.0.0-alpha3 > S3A BlockOutputStreams doesn't delete temporary files in multipart uploads or handle part upload failures > --------------------------------------------------------------------------------------------------------- > > Key: HADOOP-14028 > URL: https://issues.apache.org/jira/browse/HADOOP-14028 > Project: Hadoop Common > Issue Type: Bug > Components: fs/s3 > Affects Versions: 2.8.0 > Environment: JDK 8 + ORC 1.3.0 + hadoop-aws 3.0.0-alpha2 > Reporter: Seth Fitzsimmons > Assignee: Steve Loughran > Priority: Critical > Fix For: 2.8.0, 3.0.0-alpha3 > > Attachments: HADOOP-14028-006.patch, HADOOP-14028-007.patch, HADOOP-14028-branch-2-001.patch, HADOOP-14028-branch-2-008.patch, HADOOP-14028-branch-2-009.patch, HADOOP-14028-branch-2.8-002.patch, HADOOP-14028-branch-2.8-003.patch, HADOOP-14028-branch-2.8-004.patch, HADOOP-14028-branch-2.8-005.patch, HADOOP-14028-branch-2.8-007.patch, HADOOP-14028-branch-2.8-008.patch > > > I have `fs.s3a.fast.upload` enabled with 3.0.0-alpha2 (it's exactly what I was looking for after running into the same OOM problems) and don't see it cleaning up the disk-cached blocks. > I'm generating a ~50GB file on an instance with ~6GB free when the process starts. My expectation is that local copies of the blocks would be deleted after those parts finish uploading, but I'm seeing more than 15 blocks in /tmp (and none of them have been deleted thus far). > I see that DiskBlock deletes temporary files when closed, but is it closed after individual blocks have finished uploading or when the entire file has been fully written to the FS (full upload completed, including all parts)? > As a temporary workaround to avoid running out of space, I'm listing files, sorting by atime, and deleting anything older than the first 20: `ls -ut | tail -n +21 | xargs rm` > Steve Loughran says: > > They should be deleted as soon as the upload completes; the close() call that the AWS httpclient makes on the input stream triggers the deletion. Though there aren't tests for it, as I recall. -- This message was sent by Atlassian JIRA (v6.3.15#6346) --------------------------------------------------------------------- To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org For additional commands, e-mail: common-issues-help@hadoop.apache.org