Date: Wed, 29 Aug 2012 09:41:08 +1100 (NCT)
From: "Shawn Smith (JIRA)"
To: crunch-dev@incubator.apache.org
Reply-To: crunch-dev@incubator.apache.org
Message-ID: <1887669182.9127.1346193668326.JavaMail.jiratomcat@arcas>
In-Reply-To: <546333720.8308.1346184307786.JavaMail.jiratomcat@arcas>
Subject: [jira] [Updated] (CRUNCH-53) AvroFileReaderFactory does not close input files
Mailing-List: contact crunch-dev-help@incubator.apache.org; run by ezmlm

     [ https://issues.apache.org/jira/browse/CRUNCH-53?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shawn Smith updated CRUNCH-53:
------------------------------
    Attachment: CRUNCH-53-autoclose.patch

I've attached a patch that closes the input files as long as the calling code loops through the entire iterable (until Iterator.hasNext() returns false). This should handle most situations.
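As a rough illustration of the close-on-exhaustion idea, here is a minimal sketch. It is not the code in CRUNCH-53-autoclose.patch: the class name AutoClosingIterator and its constructor are hypothetical, and it assumes the reader can be handed over as a plain java.io.Closeable. The wrapper closes the underlying resource the first time hasNext() reports that nothing is left.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

/**
 * Hypothetical wrapper (not the class from the attached patch): closes the
 * underlying resource the first time hasNext() finds no more elements.
 */
final class AutoClosingIterator<T> implements Iterator<T> {
    private final Iterator<T> delegate;
    private final Closeable resource;
    private boolean closed;

    AutoClosingIterator(Iterator<T> delegate, Closeable resource) {
        this.delegate = delegate;
        this.resource = resource;
    }

    @Override
    public boolean hasNext() {
        if (closed) {
            return false; // already exhausted and closed
        }
        if (delegate.hasNext()) {
            return true;
        }
        close(); // exhausted: release the input file now
        return false;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return delegate.next();
    }

    private void close() {
        closed = true;
        try {
            resource.close();
        } catch (IOException e) {
            // Iterator methods cannot throw checked exceptions
            throw new UncheckedIOException(e);
        }
    }
}
```

The closed flag keeps repeated hasNext() calls after exhaustion from closing the resource a second time.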
The patch doesn't fix the situation where the client doesn't loop through to completion because of an early termination case or an exception being thrown. That's actually the scenario that leads to the jets3t warning in the ticket description. In those cases it will be left to finalizers to close files.

> AvroFileReaderFactory does not close input files
> ------------------------------------------------
>
>                 Key: CRUNCH-53
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-53
>             Project: Crunch
>          Issue Type: Bug
>          Components: IO
>            Reporter: Shawn Smith
>            Priority: Minor
>         Attachments: CRUNCH-53-autoclose.patch
>
>
> The AvroFileReaderFactory read() method does not close its DataFileReader. With the Hadoop NativeS3FileSystem this can lead to the following warning:
> org.jets3t.service.impl.rest.httpclient.HttpMethodReleaseInputStream: Successfully released HttpMethod in finalize(). You were lucky this time... Please ensure S3 response data streams are always fully consumed or closed.
> WARN [2012-08-28 19:26:16,035] org.jets3t.service.impl.rest.httpclient.HttpMethodReleaseInputStream: Attempting to release HttpMethod in finalize() as its response data stream has gone out of scope. This attempt will not always succeed and cannot be relied upon! Please ensure S3 response data streams are always fully consumed or closed to avoid HTTP connection starvation.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira