Date: Fri, 7 Dec 2018 20:03:00 +0000 (UTC)
From: "Lukas Majercak (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Updated] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

     [ https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lukas Majercak updated HDFS-14134:
----------------------------------
    Description:
Currently, some operations that throw IOException on the NameNode are evaluated by RetryPolicy as FAILOVER_AND_RETRY, but they should just fail fast.

For example, when calling getXAttr("user.some_attr", file) where the file does not have the attribute, NN throws an IOException with message "could not find attr". The current client retry policy maps this exception to FAILOVER_AND_RETRY. The client then fails over and retries until it reaches the maximum number of retries. Ideally, the client would recognize that this exception is normal and fail fast.

Moreover, even if the action were FAIL, the RetryInvocationHandler looks at the retry actions from all requests, and FAILOVER_AND_RETRY takes precedence over the FAIL action.

  was:
Currently, some operations that throw IOException on the NameNode are evaluated by RetryPolicy as FAILOVER_AND_RETRY, but they should just fail fast.

For example, when calling getXAttr("user.some_attr", file) where file does not have the attribute, NN throws an IOException with message "could not find attr".
The current client retry policy maps this exception to FAILOVER_AND_RETRY. The client then fails over and retries until it reaches the maximum number of retries. Ideally, the client would recognize that this exception is normal and fail fast.

Moreover, even if the action were FAIL, the RetryInvocationHandler looks at the retry actions from all requests, and FAILOVER_AND_RETRY takes precedence over the FAIL action.

> Idempotent operations throwing RemoteException should not be retried by the client
> ----------------------------------------------------------------------------------
>
>                 Key: HDFS-14134
>                 URL: https://issues.apache.org/jira/browse/HDFS-14134
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs, hdfs-client, ipc
>            Reporter: Lukas Majercak
>            Assignee: Lukas Majercak
>            Priority: Critical
>         Attachments: HDFS-14134.001.patch
>
>
> Currently, some operations that throw IOException on the NameNode are evaluated by RetryPolicy as FAILOVER_AND_RETRY, but they should just fail fast.
> For example, when calling getXAttr("user.some_attr", file) where the file does not have the attribute, NN throws an IOException with message "could not find attr". The current client retry policy maps this exception to FAILOVER_AND_RETRY. The client then fails over and retries until it reaches the maximum number of retries. Ideally, the client would recognize that this exception is normal and fail fast.
> Moreover, even if the action were FAIL, the RetryInvocationHandler looks at the retry actions from all requests, and FAILOVER_AND_RETRY takes precedence over the FAIL action.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
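
As a rough illustration of the behaviour described in the HDFS-14134 description above, the following self-contained sketch models the two retry outcomes and the reported precedence rule. It is not Hadoop's code and is not taken from HDFS-14134.001.patch: the class RetryDecisionSketch, the ServerSideException stand-in for an exception returned by the NameNode, and the methods currentDecision, proposedDecision and combine are all hypothetical names introduced here, and the "any IOException on an idempotent call fails over" rule is a deliberate simplification of the reported behaviour.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class RetryDecisionSketch {

    /** The two retry outcomes named in the issue description. */
    enum Decision { FAIL, FAILOVER_AND_RETRY }

    /** Hypothetical stand-in for an exception raised on the NameNode and returned to the client. */
    static class ServerSideException extends IOException {
        ServerSideException(String message) {
            super(message);
        }
    }

    /** Simplified model of the reported behaviour: any IOException on an idempotent call fails over. */
    static Decision currentDecision(IOException e, boolean idempotent) {
        return idempotent ? Decision.FAILOVER_AND_RETRY : Decision.FAIL;
    }

    /** Simplified model of the requested behaviour: server-side exceptions fail fast. */
    static Decision proposedDecision(IOException e, boolean idempotent) {
        if (e instanceof ServerSideException) {
            return Decision.FAIL;
        }
        return currentDecision(e, idempotent);
    }

    /** Precedence as reported: one FAILOVER_AND_RETRY outweighs any number of FAIL decisions. */
    static Decision combine(List<Decision> decisions) {
        return decisions.contains(Decision.FAILOVER_AND_RETRY)
                ? Decision.FAILOVER_AND_RETRY
                : Decision.FAIL;
    }

    public static void main(String[] args) {
        // The getXAttr case from the description: the attribute is missing on the file.
        IOException missingAttr = new ServerSideException("could not find attr");

        System.out.println(currentDecision(missingAttr, true));   // prints FAILOVER_AND_RETRY
        System.out.println(proposedDecision(missingAttr, true));  // prints FAIL

        // A FAIL from one request is overridden by FAILOVER_AND_RETRY from another.
        System.out.println(combine(Arrays.asList(Decision.FAIL, Decision.FAILOVER_AND_RETRY)));
        // prints FAILOVER_AND_RETRY
    }
}
```

The last call shows why changing a single decision to FAIL would not be enough under the reported precedence rule: the combined result stays FAILOVER_AND_RETRY as long as any other request yields it.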