From: "Josh Elser (JIRA)"
To: dev@phoenix.apache.org
Date: Thu, 18 Jan 2018 22:45:05 +0000 (UTC)
Subject: [jira] [Created] (PHOENIX-4537) RegionServer initiating compaction can trigger schema migration and deadlock the system

Josh Elser created PHOENIX-4537:
-----------------------------------

Summary: RegionServer initiating compaction can trigger schema migration and deadlock the system
Key: PHOENIX-4537
URL: https://issues.apache.org/jira/browse/PHOENIX-4537
Project: Phoenix
Issue Type: Bug
Reporter: Romil Choksi
Fix For: 5.0.0, 4.14.0

[~sergey.soldatov] has been doing some great digging around a test failure we've been seeing at $dayjob. The situation goes like this.

0. Run some arbitrary load
1. Stop HBase
2. Enable schema mapping ({{phoenix.schema.isNamespaceMappingEnabled=true}} and {{phoenix.schema.mapSystemTablesToNamespace=true}} in hbase-site.xml)
3. Start HBase
4. Circumstantially, have the SYSTEM.CATALOG table need a compaction to run before a client first connects

When the RegionServer initiates the compaction, it will end up running {{UngroupedAggregateRegionObserver.clearTsOnDisabledIndexes}} which opens a Phoenix connection.
While the RegionServer won't upgrade system tables, it *will* try to migrate them into the schema-mapped variants (e.g. SYSTEM.CATALOG to SYSTEM:CATALOG). However, one of the first steps in the schema migration is to disable the SYSTEM.CATALOG table. That table can't be disabled until its region is CLOSED, and the region cannot be CLOSED until the compaction that triggered the migration is finished. *deadlock*

The "obvious" fix is to prevent RegionServers from triggering system table migrations, but Sergey and I both think that this will end badly (RegionServers falling over because they expect the tables to be migrated and they aren't).

Thoughts? [~ankit.singhal], [~jamestaylor], any others?

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
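
For anyone trying to picture the cycle, here is a minimal toy sketch in plain Java. It does not use any HBase or Phoenix classes: the class name, the method name, the two latches and the timeout are all hypothetical stand-ins, and the real code paths are certainly more involved. It only models the wait-for relationship described above (compaction waits on the migration, the migration's disable waits on the region being CLOSED, and the close waits on the compaction), with a timeout so the program gives up instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Toy model of the circular wait; no HBase or Phoenix classes are involved. */
final class CompactionDeadlockSketch {

    /**
     * Returns true if the simulated compaction finishes before the timeout.
     * migrateOnServer = true models the server-side connection attempting
     * the namespace migration, as described in the report (an assumption
     * of this sketch, not the actual Phoenix code path).
     */
    static boolean compactionCompletes(boolean migrateOnServer, long timeoutMillis) {
        CountDownLatch compactionDone = new CountDownLatch(1);
        CountDownLatch regionClosed = new CountDownLatch(1);
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // Region close: cannot complete until the compaction has ended.
            pool.execute(() -> {
                try {
                    if (compactionDone.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
                        regionClosed.countDown();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            // Compaction: opens a "connection" before it can finish.
            pool.execute(() -> {
                try {
                    if (migrateOnServer) {
                        // Migration begins by disabling the table, which needs
                        // the region CLOSED, which needs this compaction done.
                        if (!regionClosed.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
                            return; // stuck: give up rather than hang forever
                        }
                    }
                    compactionDone.countDown();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            return compactionDone.await(2 * timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("with server-side migration, compaction completes: "
                + compactionCompletes(true, 200));
        System.out.println("without server-side migration, compaction completes: "
                + compactionCompletes(false, 200));
    }
}
```

In this model the first call times out (neither latch is ever released) and the second completes. Note that the second case only shows the cycle being broken; it says nothing about the concern raised above that RegionServers may fall over when they expect migrated tables that are not there.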