Date: Tue, 5 May 2015 13:36:51 +0000 (UTC)
From: "Jacques Nadeau (JIRA)"
To: issues@drill.apache.org
Reply-To: dev@drill.apache.org
Subject: [jira] [Updated] (DRILL-2120) Bringing up multiple drillbits at same time results in synchronization failure

     [ https://issues.apache.org/jira/browse/DRILL-2120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacques Nadeau updated DRILL-2120:
----------------------------------
    Fix Version/s: 1.2.0 (was: 1.0.0)

> Bringing up multiple drillbits at same time results in synchronization failure
> ------------------------------------------------------------------------------
>
>                 Key: DRILL-2120
>                 URL: https://issues.apache.org/jira/browse/DRILL-2120
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Flow
>    Affects Versions: 0.8.0
>            Reporter: Ramana Inukonda Nagaraj
>            Assignee: Steven Phillips
>             Fix For: 1.2.0
>
>
> Repro:
> With a fresh ZK install, bring up 4 drillbits at the same time using something like clush:
> clush -g ats /opt/drill/bin/drillbit.sh start
> It looks like all 4 nodes query ZK to see whether the node exists, and all of them try to create it at the same time. Some succeed, others don't. The ones that fail have incorrect information about the state of ZK, which would explain the stack trace below.
> {code}
> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
> Exception in thread "main" org.apache.drill.exec.exception.DrillbitStartupException: Failure during initial startup of Drillbit.
>     at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:76)
>     at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:60)
>     at org.apache.drill.exec.server.Drillbit.main(Drillbit.java:83)
> Caused by: java.lang.RuntimeException: Failure while accessing Zookeeper
>     at org.apache.drill.exec.store.sys.zk.ZkAbstractStore.putIfAbsent(ZkAbstractStore.java:135)
>     at org.apache.drill.exec.store.StoragePluginRegistry.createPlugins(StoragePluginRegistry.java:150)
>     at org.apache.drill.exec.store.StoragePluginRegistry.init(StoragePluginRegistry.java:130)
>     at org.apache.drill.exec.server.Drillbit.run(Drillbit.java:155)
>     at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:73)
>     ... 2 more
> Caused by: java.lang.RuntimeException: Failure while accessing Zookeeper
>     at org.apache.drill.exec.store.sys.zk.ZkPStore.createNodeInZK(ZkPStore.java:53)
>     at org.apache.drill.exec.store.sys.zk.ZkAbstractStore.putIfAbsent(ZkAbstractStore.java:129)
>     ... 6 more
> Caused by: org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /drill-ats-build/sys.storage_plugins/cp
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:119)
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>     at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
>     at org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:676)
>     at org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:660)
>     at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
>     at org.apache.curator.framework.imps.CreateBuilderImpl.pathInForeground(CreateBuilderImpl.java:656)
>     at org.apache.curator.framework.imps.CreateBuilderImpl.protectedPathInForeground(CreateBuilderImpl.java:441)
>     at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:431)
>     at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:44)
>     at org.apache.drill.exec.store.sys.zk.ZkPStore.createNodeInZK(ZkPStore.java:51)
>     ... 7 more
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
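
For illustration only, the check-then-create race described in the report can be sketched with an in-memory stand-in for ZooKeeper. This is not Drill's ZkPStore code and not the fix that was committed for DRILL-2120. The names PutIfAbsentSketch, FakeStore, raceStarters and the local NodeExistsException class are hypothetical and exist only for this sketch. The assumption it demonstrates is that a "put if absent" should let the store decide atomically who creates the node, and should treat "node already exists" as a normal outcome for the starters that lose the race rather than as a startup failure.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

final class PutIfAbsentSketch {

    /** Hypothetical stand-in for ZooKeeper's NodeExistsException. */
    static final class NodeExistsException extends Exception {
        NodeExistsException(String path) {
            super("NodeExists for " + path);
        }
    }

    /** Hypothetical in-memory stand-in for the ZooKeeper-backed store. */
    static final class FakeStore {
        private final ConcurrentMap<String, String> nodes = new ConcurrentHashMap<>();

        /** Atomically creates the node, or throws if it already exists. */
        void create(String path, String value) throws NodeExistsException {
            if (nodes.putIfAbsent(path, value) != null) {
                throw new NodeExistsException(path);
            }
        }

        String get(String path) {
            return nodes.get(path);
        }
    }

    /**
     * Race-tolerant put-if-absent: there is no separate "exists?" check.
     * Returns true if this caller created the node, false if another
     * caller had already created it.
     */
    static boolean putIfAbsent(FakeStore store, String path, String value) {
        try {
            store.create(path, value);
            return true;
        } catch (NodeExistsException e) {
            // A concurrent starter won the race; the node is present, which is fine.
            return false;
        }
    }

    /** Starts the given number of threads at once and counts how many created the node. */
    static int raceStarters(int starters) {
        FakeStore store = new FakeStore();
        CountDownLatch go = new CountDownLatch(1);
        AtomicInteger created = new AtomicInteger();
        Thread[] threads = new Thread[starters];
        for (int i = 0; i < starters; i++) {
            final String value = "drillbit-" + i;
            threads[i] = new Thread(() -> {
                try {
                    go.await(); // hold every thread until all have been started
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                if (putIfAbsent(store, "/drill-ats-build/sys.storage_plugins/cp", value)) {
                    created.incrementAndGet();
                }
            });
            threads[i].start();
        }
        go.countDown(); // release all threads together
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return created.get();
    }

    public static void main(String[] args) {
        System.out.println("creators: " + raceStarters(4));
    }
}
```

In this sketch exactly one of the four simulated starters reports having created the node, and the other three return false instead of throwing. Whether a real store should also re-read the winning value after losing the race depends on what the caller needs.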