From hdfs-issues-return-276079-archive-asf-public=cust-asf.ponee.io@hadoop.apache.org Wed Aug 7 02:25:03 2019
Date: Wed, 7 Aug 2019 02:25:00 +0000 (UTC)
From: "ASF GitHub Bot (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Work logged] (HDDS-1610) applyTransaction failure should not be lost on restart
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

[ https://issues.apache.org/jira/browse/HDDS-1610?focusedWorklogId=290175&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-290175 ]

ASF GitHub Bot logged work on HDDS-1610:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 07/Aug/19 02:24
            Start Date: 07/Aug/19 02:24
    Worklog Time Spent: 10m
      Work Description: bshashikant commented on issue #1226: HDDS-1610. applyTransaction failure should not be lost on restart.
URL: https://github.com/apache/hadoop/pull/1226#issuecomment-518913410

   Thanks @mukul1987. As far as I understand, Ratis waits for all pending applyTransaction futures to complete before taking a snapshot. Since the patch now propagates the applyTransaction exception to Ratis, snapshot creation should ideally fail in Ratis. I will address the remaining review comments as part of the next patch.

Issue Time Tracking
-------------------

    Worklog Id:     (was: 290175)
    Time Spent: 3h 10m  (was: 3h)

> applyTransaction failure should not be lost on restart
> ------------------------------------------------------
>
>                 Key: HDDS-1610
>                 URL: https://issues.apache.org/jira/browse/HDDS-1610
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>            Reporter: Shashikant Banerjee
>            Assignee: Shashikant Banerjee
>            Priority: Blocker
>              Labels: pull-request-available
>          Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> If applyTransaction fails in the containerStateMachine, then the container should not accept new writes on restart.
> This can occur if:
> # chunk write applyTransaction fails
> # container state update to UNHEALTHY also fails
> # Ratis snapshot is taken
> # Node restarts
> # container accepts new transactions
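
The behaviour described in the comment (a snapshot is taken only after every pending applyTransaction future has completed, and fails if one of them failed) can be illustrated with a minimal Java sketch. It is not the Ratis or Ozone code: the class SnapshotGateSketch, its methods, and the boolean chunkWriteSucceeds flag are hypothetical names introduced here, and only java.util.concurrent.CompletableFuture from the JDK is used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

// Hypothetical stand-in for a state machine; NOT the Ratis or Ozone API.
class SnapshotGateSketch {
    // Futures of applied transactions not yet covered by a snapshot.
    private final List<CompletableFuture<Long>> pendingApplies = new ArrayList<>();
    private long lastSnapshotIndex = -1;

    // Simulates applyTransaction: the future fails when the chunk write fails.
    CompletableFuture<Long> applyTransaction(long index, boolean chunkWriteSucceeds) {
        CompletableFuture<Long> future = new CompletableFuture<>();
        if (chunkWriteSucceeds) {
            future.complete(index);
        } else {
            future.completeExceptionally(
                new IllegalStateException("chunk write failed at index " + index));
        }
        pendingApplies.add(future);
        return future;
    }

    // Waits for all pending applies; refuses to snapshot if any of them failed.
    boolean takeSnapshot() {
        try {
            CompletableFuture.allOf(pendingApplies.toArray(new CompletableFuture[0])).join();
        } catch (CompletionException e) {
            // The failure is propagated, so the snapshot index does not move past it.
            return false;
        }
        for (CompletableFuture<Long> future : pendingApplies) {
            lastSnapshotIndex = Math.max(lastSnapshotIndex, future.join());
        }
        pendingApplies.clear();
        return true;
    }

    long lastSnapshotIndex() {
        return lastSnapshotIndex;
    }

    public static void main(String[] args) {
        SnapshotGateSketch sm = new SnapshotGateSketch();
        sm.applyTransaction(1, true);
        System.out.println("snapshot ok: " + sm.takeSnapshot());
        sm.applyTransaction(2, false); // failed chunk write
        sm.applyTransaction(3, true);
        System.out.println("snapshot ok: " + sm.takeSnapshot());
        System.out.println("last snapshot index: " + sm.lastSnapshotIndex());
    }
}
```

In this sketch the failed transaction at index 2 keeps the snapshot index at 1, so a restart that replays the log from the last snapshot would see the failed transaction again rather than skipping it. Whether the real patch achieves this in the same way is not stated in the message.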