Date: Thu, 19 Jan 2017 19:32:26 +0000 (UTC)
From: "Eugene Koifman (JIRA)"
To: issues@hive.apache.org
Reply-To: dev@hive.apache.org
Subject: [jira] [Updated] (HIVE-13014) RetryingMetaStoreClient is retrying too aggresievley

     [ https://issues.apache.org/jira/browse/HIVE-13014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eugene Koifman updated HIVE-13014:
----------------------------------
    Attachment: HIVE-13014.07.patch

> RetryingMetaStoreClient is retrying too aggresievley
> ----------------------------------------------------
>
>                 Key: HIVE-13014
>                 URL: https://issues.apache.org/jira/browse/HIVE-13014
>             Project: Hive
>          Issue Type: Bug
>          Components: Metastore, Transactions
>    Affects Versions: 1.0.0
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>            Priority: Critical
>         Attachments: HIVE-13014.01.patch, HIVE-13014.02.patch, HIVE-13014.03.patch, HIVE-13014.04.patch, HIVE-13014.05.patch, HIVE-13014.06.patch, HIVE-13014.07.patch
>
>
> Not all metastore operations are idempotent. For example, commit_txn() consists of:
> 1. request from client to server
> 2. server action
> 3. ack to client
> If the network connection is broken after (or during) step 2 but before step 3 happens, RetryingMetaStoreClient will retry the operation, causing an attempt to commit the same txn twice (sometimes concurrently).
> The 2nd attempt is guaranteed to fail and thus returns an error to the caller (which doesn't know the operation is being retried), while the first attempt has actually succeeded. Thus the caller thinks the commit failed and will likely attempt to redo the transaction, which is not what we want in most cases.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
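
For readers of the archive, the failure sequence described in the issue can be reproduced with a small toy model. This is only an illustrative sketch in Python, not Hive's code: ToyMetastore, retrying_call, ConnectionLost, TxnNotOpen and the drop_next_ack flag are all hypothetical names invented here, and the real RetryingMetaStoreClient is a Java class whose retry logic is assumed, per the description above, to re-send the call after a connection failure.

```python
class ConnectionLost(Exception):
    """The ack (step 3) never reached the client."""


class TxnNotOpen(Exception):
    """The server was asked to commit a txn that is not open."""


class ToyMetastore:
    """Hypothetical stand-in for the metastore server (not Hive's API)."""

    def __init__(self):
        self.open_txns = set()
        self.committed = set()
        self.drop_next_ack = False  # simulates a broken connection

    def open_txn(self, txn_id):
        self.open_txns.add(txn_id)

    def commit_txn(self, txn_id):
        # Step 2: server action. Committing is not idempotent.
        if txn_id not in self.open_txns:
            raise TxnNotOpen(f"txn {txn_id} is not open")
        self.open_txns.remove(txn_id)
        self.committed.add(txn_id)
        # Step 3: ack to client, which may be lost on the wire.
        if self.drop_next_ack:
            self.drop_next_ack = False
            raise ConnectionLost("connection broke before ack")
        return "ok"


def retrying_call(fn, *args, retries=1):
    """Blindly re-send the call on connection loss (assumed behaviour)."""
    for attempt in range(retries + 1):
        try:
            return fn(*args)
        except ConnectionLost:
            if attempt == retries:
                raise


def demo():
    server = ToyMetastore()
    server.open_txn(7)
    server.drop_next_ack = True  # lose the ack of the first attempt
    try:
        retrying_call(server.commit_txn, 7)
        outcome = "ok"
    except TxnNotOpen:
        outcome = "error"  # what the caller sees
    return outcome, 7 in server.committed


print(demo())
```

In this model the caller is told the commit failed even though txn 7 is recorded as committed on the server, which is the mismatch the issue reports.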