From: Steven Phillips
To: drill-dev@incubator.apache.org
Cc: Apache Drill User
Date: Sun, 27 Oct 2013 15:17:26 -0700
Subject: Re: Distributed mode troubles: ZK/Curator connection time out

You need to replace localhost with the hostname of the node running
ZooKeeper. If that ZooKeeper is configured to use a port other than 2181,
then the port needs to be set as well. If you have multiple ZooKeepers in
the quorum, then zk.connect should be a comma-separated list of the
host:port of each node. The default localhost setting will only work when a
drillbit is running on the same node as ZooKeeper.


On Sun, Oct 27, 2013 at 2:57 PM, Michael Hausenblas <
michael.hausenblas@gmail.com> wrote:

>
> > One thing to add to the diagram is that all of the drill java processes
> > will look at what is in drill-override.conf.
>
> Thanks, done.
>
>
> > You must set zk.connect to the correct zk host:port.
>
>
> Can you be a tad more explicit, please? In drill-override.conf I have
>
> [[
> …
>   zk: {
>     connect: "localhost:2181",
> …
> ]]
>
>
> What am I overlooking?
>
> Also, any directions re the rest of my questions (re bin/submit_plan etc.)?
>
>
> With a little help from here, I'm happy to put together the description
> how to set this up in the Wiki, also to address a query we've now lying
> around for more than three weeks, by Steve McPherson - see
> http://mail-archives.apache.org/mod_mbox/incubator-drill-user/201310.mbox/%3CCE71A20F.14F5B%25stevemp%40amazon.com%3E
> - the fact that it attracted 0 responses I find slightly embarrassing, and
> if I were Steve, I'd prolly not touch Drill anymore, but let's hope for the
> best …
>
>
> Cheers,
> Michael
>
> --
> Michael Hausenblas
> Ireland, Europe
> http://mhausenblas.info/
>
> On 27 Oct 2013, at 21:32, Steven Phillips wrote:
>
> > One thing to add to the diagram is that all of the drill java processes
> > will look at what is in drill-override.conf. You must set zk.connect to
> > the correct zk host:port.
> >
> >
> > On Sun, Oct 27, 2013 at 2:00 PM, Michael Hausenblas <
> > michael.hausenblas@gmail.com> wrote:
> >
> >>
> >> Folks,
> >>
> >> I'm trying to set up Drill in distributed mode.
> >> Here's what I have so far:
> >> when I launch the first Drillbit with bin/drillbit.sh I get the following
> >> in log/drillbit.out:
> >>
> >> [[
> >> 20:47:20.963 [main] ERROR com.netflix.curator.ConnectionState - Connection
> >> timed out for connection string (localhost:2181) and timeout (5000) /
> >> elapsed (5045)
> >> org.apache.zookeeper.KeeperException$ConnectionLossException:
> >> KeeperErrorCode = ConnectionLoss
> >>     at com.netflix.curator.ConnectionState.getZooKeeper(ConnectionState.java:94) ~[curator-client-1.1.9.jar:na]
> >>     at com.netflix.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:106) [curator-client-1.1.9.jar:na]
> >>     at com.netflix.curator.framework.imps.CuratorFrameworkImpl.getZooKeeper(CuratorFrameworkImpl.java:393) [curator-framework-1.1.9.jar:na]
> >>     at com.netflix.curator.framework.imps.GetChildrenBuilderImpl$3.call(GetChildrenBuilderImpl.java:184) [curator-framework-1.1.9.jar:na]
> >>     at com.netflix.curator.framework.imps.GetChildrenBuilderImpl$3.call(GetChildrenBuilderImpl.java:173) [curator-framework-1.1.9.jar:na]
> >>     at com.netflix.curator.RetryLoop.callWithRetry(RetryLoop.java:85) [curator-client-1.1.9.jar:na]
> >>     at com.netflix.curator.framework.imps.GetChildrenBuilderImpl.pathInForeground(GetChildrenBuilderImpl.java:169) [curator-framework-1.1.9.jar:na]
> >>     at com.netflix.curator.framework.imps.GetChildrenBuilderImpl.forPath(GetChildrenBuilderImpl.java:161) [curator-framework-1.1.9.jar:na]
> >>     at com.netflix.curator.framework.imps.GetChildrenBuilderImpl.forPath(GetChildrenBuilderImpl.java:36) [curator-framework-1.1.9.jar:na]
> >>     at com.netflix.curator.x.discovery.details.ServiceDiscoveryImpl.getChildrenWatched(ServiceDiscoveryImpl.java:306) [curator-x-discovery-1.1.9.jar:na]
> >>     at com.netflix.curator.x.discovery.details.ServiceDiscoveryImpl.queryForInstances(ServiceDiscoveryImpl.java:276) [curator-x-discovery-1.1.9.jar:na]
> >>     at com.netflix.curator.x.discovery.details.ServiceCache.refresh(ServiceCache.java:193) [curator-x-discovery-1.1.9.jar:na]
> >>     at com.netflix.curator.x.discovery.details.ServiceCache.start(ServiceCache.java:116) [curator-x-discovery-1.1.9.jar:na]
> >>     at org.apache.drill.exec.coord.ZKClusterCoordinator.start(ZKClusterCoordinator.java:89) [java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
> >>     at org.apache.drill.exec.server.Drillbit.run(Drillbit.java:94) [java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
> >>     at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:56) [java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
> >>     at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:43) [java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
> >>     at org.apache.drill.exec.server.Drillbit.main(Drillbit.java:65) [java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
> >> ]]
> >>
> >> This seems to be a known issue? See
> >> http://stackoverflow.com/questions/16056751/curator-zookeeper-client-keeps-throw-out-connectionlossexception-per-connection
> >>
> >> Any ideas? Did anyone actually run Drill in distributed mode already and
> >> if so, how did you overcome the above issue?
> >>
> >> What is next? How do I make other Drillbits point to the same ZK cluster?
> >> And has anyone an example of the call parameters for bin/submit_plan maybe
> >> as well?
> >>
> >>
> >> BTW, in the process of trying to figure what's going on behind the scene I
> >> traced down the startup call dependencies (scripts), available via:
> >>
> >> https://docs.google.com/drawings/d/1-ADIGJ-lBr-dOrOjMpQlProiZjYjjuM0kR6A81BYwKA/edit?usp=sharing
> >>
> >> which we could then also use for documentation purposes.
> >>
> >>
> >> Cheers,
> >> Michael
> >>
> >> --
> >> Michael Hausenblas
> >> Ireland, Europe
> >> http://mhausenblas.info/
> >>
> >>
> >
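
A minimal sketch of the zk.connect change described in the reply at the top
of this message, for a three-node quorum. The hostnames zk1.example.com,
zk2.example.com and zk3.example.com are hypothetical placeholders, and the
settings elided around the zk block in the quoted drill-override.conf are
left out here rather than guessed:

```hocon
zk: {
  # Hypothetical hosts: substitute the nodes that actually run ZooKeeper.
  # 2181 is the default port mentioned in the reply; change it if your
  # ZooKeeper listens elsewhere.
  connect: "zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181"
}
```

With a single ZooKeeper the list collapses to one host:port entry. Since all
of the Drill Java processes read drill-override.conf, the same value would
presumably go into that file on every node that runs a drillbit.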