From: "Aaron Fabbri (JIRA)"
To: common-issues@hadoop.apache.org
Date: Fri, 2 Mar 2018 22:24:00 +0000 (UTC)
Subject: [jira] [Commented] (HADOOP-13761) S3Guard: implement retries for DDB failures and throttling; translate exceptions

    [ https://issues.apache.org/jira/browse/HADOOP-13761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384246#comment-16384246 ]

Aaron Fabbri commented on HADOOP-13761:
---------------------------------------

Findbugs seems to be smoking crack and/or lambda-challenged.

|Return value of S3AReadOpContext.getReadInvoker() ignored, but method has no side effect At S3AInputStream.java:[line 181]|

{noformat}
S3Object object = context.getReadInvoker().once(text, uri, () -> client.getObject(request));
{noformat}

> S3Guard: implement retries for DDB failures and throttling; translate exceptions
> --------------------------------------------------------------------------------
>
>                 Key: HADOOP-13761
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13761
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.0.0-beta1
>            Reporter: Aaron Fabbri
>            Assignee: Aaron Fabbri
>            Priority: Blocker
>         Attachments: HADOOP-13761-004-to-005.patch, HADOOP-13761-005-to-006-approx.diff.txt, HADOOP-13761-005.patch, HADOOP-13761-006.patch, HADOOP-13761-007.patch, HADOOP-13761-008.patch, HADOOP-13761-009.patch, HADOOP-13761-010.patch, HADOOP-13761-010.patch, HADOOP-13761-011.patch, HADOOP-13761-012.patch, HADOOP-13761.001.patch, HADOOP-13761.002.patch, HADOOP-13761.003.patch, HADOOP-13761.004.patch
>
>
> Following the S3AFileSystem integration patch in HADOOP-13651, we need to add retry logic.
> In HADOOP-13651, I added TODO comments in most of the places retry loops are needed, including:
> - open(path). If MetadataStore reflects recent create/move of file path, but we fail to read it from S3, retry.
> - delete(path). If deleteObject() on S3 fails, but MetadataStore shows the file exists, retry.
> - rename(src,dest). If source path is not visible in S3 yet, retry.
> - listFiles(). Skip for now. Not currently implemented in S3Guard. I will create a separate JIRA for this as it will likely require interface changes (i.e. prefix or subtree scan).
> We may miss some cases initially and we should do failure injection testing to make sure we're covered. Failure injection tests can be a separate JIRA to make this easier to review.
> We also need basic configuration parameters around retry policy. There should be a way to specify a maximum retry duration, as some applications would prefer to eventually receive an error rather than wait indefinitely. We should also keep statistics on when inconsistency is detected and we enter a retry loop.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
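
The retry-policy requirements in the description (a maximum retry duration, plus statistics for when a retry loop is entered) could look roughly like the sketch below. `BoundedRetry`, `IOOperation`, and the parameter names are hypothetical; they are not taken from the patch or from the S3A invoker code, and a fixed sleep interval stands in for whatever backoff the real policy uses.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical operation type: any call that may fail with an IOException. */
@FunctionalInterface
interface IOOperation<T> {
    T call() throws IOException;
}

/** Hypothetical helper (not part of S3A): retries until a maximum duration elapses. */
final class BoundedRetry {
    private final long maxDurationNanos;
    private final long intervalNanos;
    // Number of invocations that had to enter the retry loop at least once.
    private final AtomicLong retryLoopsEntered = new AtomicLong();

    BoundedRetry(long maxDurationMillis, long intervalMillis) {
        this.maxDurationNanos = TimeUnit.MILLISECONDS.toNanos(maxDurationMillis);
        this.intervalNanos = TimeUnit.MILLISECONDS.toNanos(intervalMillis);
    }

    long getRetryLoopsEntered() {
        return retryLoopsEntered.get();
    }

    <T> T execute(IOOperation<T> operation) throws IOException {
        final long deadline = System.nanoTime() + maxDurationNanos;
        boolean retrying = false;
        while (true) {
            try {
                return operation.call();
            } catch (IOException e) {
                long remaining = deadline - System.nanoTime();
                if (remaining <= 0) {
                    // Surface an error instead of waiting indefinitely.
                    throw new IOException("Giving up: maximum retry duration exceeded", e);
                }
                if (!retrying) {
                    // Record the first retry of this invocation only.
                    retrying = true;
                    retryLoopsEntered.incrementAndGet();
                }
                try {
                    // Never sleep past the deadline.
                    TimeUnit.NANOSECONDS.sleep(Math.min(intervalNanos, remaining));
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    InterruptedIOException iioe =
                        new InterruptedIOException("Interrupted while retrying");
                    iioe.initCause(ie);
                    throw iioe;
                }
            }
        }
    }
}
```

A caller would wrap the S3 call in the lambda, for example `new BoundedRetry(30_000, 500).execute(() -> client.getObject(request))`, where the two numbers are placeholder values. The sketch retries on every `IOException`; a real policy would presumably distinguish retryable failures such as throttling from permanent ones.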