Date: Fri, 17 Aug 2018 05:22:00 +0000 (UTC)
From: "Steve Loughran (JIRA)"
To: common-issues@hadoop.apache.org
Subject: [jira] [Commented] (HADOOP-15679) ShutdownHookManager shutdown time needs to be configurable & extended

    [ https://issues.apache.org/jira/browse/HADOOP-15679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16583368#comment-16583368 ]

Steve Loughran commented on HADOOP-15679:
-----------------------------------------

Thanks for the review. I like your idea of measuring hook duration at debug level; will do. It would work best if the supplied hook instances had a useful toString() value, but at least we can log the entry ID and priority.

bq. Have you considering per file system (like s3, wasb, etc.) shutdown timeout (passed in when calling ShutdownHookManager#addShutdownHook in FileSystem#getInternal() ) as needed while keep others with a small default value?

I think the FS shutdown hook is set up before any filesystems are created, so it's not in a position to ask them. It would get really complex to have it adjust dynamically as new entries were added. The FS shutdown duration could be made another config point, independent of all other shutdown hooks, I suppose. Making the base timeout configurable seems like the simplest first step (and the least to test/configure/document).

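
For illustration only, the debug-level timing discussed in the comment above could be sketched roughly as follows. This is not the ShutdownHookManager code or the attached patch: the class `TimedHookRunner`, its `Result` type, and the method `runWithTimeout` are hypothetical names, and the sketch uses only the JDK. It runs one hook with a timeout, measures how long it took, and logs the hook's toString() value and priority at FINE (debug) level.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical helper for illustration; not the actual Hadoop implementation. */
final class TimedHookRunner {
  private static final Logger LOG =
      Logger.getLogger(TimedHookRunner.class.getName());

  /** Outcome of one hook run: did it finish normally, and how long did we wait. */
  static final class Result {
    final boolean completed;
    final long elapsedMillis;

    Result(boolean completed, long elapsedMillis) {
      this.completed = completed;
      this.elapsedMillis = elapsedMillis;
    }
  }

  /** Runs the hook on a daemon thread and waits at most the given timeout. */
  static Result runWithTimeout(Runnable hook, int priority,
                               long timeout, TimeUnit unit) {
    ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
      Thread t = new Thread(r, "hook-runner");
      t.setDaemon(true); // a hung hook must not keep the JVM alive
      return t;
    });
    long start = System.nanoTime();
    boolean completed = false;
    Future<?> future = executor.submit(hook);
    try {
      future.get(timeout, unit);
      completed = true;
    } catch (TimeoutException e) {
      future.cancel(true); // interrupt the hook thread
      LOG.warning("Hook " + hook + " timed out");
    } catch (ExecutionException e) {
      LOG.log(Level.WARNING, "Hook " + hook + " failed", e.getCause());
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      future.cancel(true);
    } finally {
      executor.shutdownNow();
    }
    long elapsed = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    // Only build the message when debug-level logging is enabled.
    if (LOG.isLoggable(Level.FINE)) {
      LOG.fine(String.format("Hook %s (priority %d) took %d ms, completed=%b",
          hook, priority, elapsed, completed));
    }
    return new Result(completed, elapsed);
  }
}
```

A hook whose class overrides toString() produces a readable log line; a lambda falls back to the JVM's synthetic class name, which is why logging the priority alongside it still helps.
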
> ShutdownHookManager shutdown time needs to be configurable & extended
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-15679
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15679
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 2.8.0, 3.0.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>         Attachments: HADOOP-15679-001.patch, HADOOP-15679-002.patch, HADOOP-15679-002.patch
>
>
> HADOOP-12950 added a timeout on shutdowns to avoid problems with hanging shutdowns. But the timeout is too short for applications where a large flush of data is needed on shutdown.
> A key example of this is Spark apps which save their history to object stores, where the file close() call triggers an upload of the final local cached block of data (could be 32+MB), and then executes the final multipart commit.
> Proposed
> # make the default sleep time 30s, not 10s
> # make it configurable with a time duration property (with a minimum time of 1s?)
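
As a rough sketch of the two proposed items quoted above (a 30s default and a time duration property with a 1s floor), the lookup might behave like this. Everything named here is an assumption for illustration: the property key `example.shutdown.timeout`, the class `ShutdownTimeoutConfig`, and the accepted suffixes are hypothetical and are not taken from the patch. Hadoop's `Configuration` class has its own `getTimeDuration` support; this version uses only the JDK so that it can run standalone.

```java
import java.util.Locale;
import java.util.Properties;
import java.util.concurrent.TimeUnit;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical config reader for illustration; not the actual Hadoop code. */
final class ShutdownTimeoutConfig {
  /** Hypothetical property name, chosen only for this example. */
  static final String TIMEOUT_KEY = "example.shutdown.timeout";
  static final long DEFAULT_SECONDS = 30; // proposed default, up from 10s
  static final long MINIMUM_SECONDS = 1;  // proposed floor

  // A number with an optional unit suffix; a bare number means seconds.
  private static final Pattern DURATION =
      Pattern.compile("(\\d+)\\s*(ms|s|m|h)?");

  /** Parses values such as "30s", "500ms", "2m" or "45" into whole seconds. */
  static long parseSeconds(String value) {
    Matcher m = DURATION.matcher(value.trim().toLowerCase(Locale.ROOT));
    if (!m.matches()) {
      throw new IllegalArgumentException("Bad duration: " + value);
    }
    long amount = Long.parseLong(m.group(1));
    String suffix = (m.group(2) == null) ? "s" : m.group(2);
    switch (suffix) {
      case "ms": return TimeUnit.MILLISECONDS.toSeconds(amount);
      case "m":  return TimeUnit.MINUTES.toSeconds(amount);
      case "h":  return TimeUnit.HOURS.toSeconds(amount);
      default:   return amount;
    }
  }

  /** Returns the configured timeout in seconds, never below the minimum. */
  static long getTimeoutSeconds(Properties conf) {
    String raw = conf.getProperty(TIMEOUT_KEY);
    long seconds = (raw == null) ? DEFAULT_SECONDS : parseSeconds(raw);
    return Math.max(seconds, MINIMUM_SECONDS);
  }
}
```

Clamping silently to the minimum, rather than rejecting a too-small value, is one possible design choice: a misconfigured value still leaves hooks some time to run.
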