Mailing-List: contact user-help@hbase.apache.org; run by ezmlm
Reply-To: user@hbase.apache.org
References: <06089C15-4B61-46C4-A7F7-F11F866275B6@gmail.com>
From: Dima Spivak
Date: Mon, 5 Sep 2016 06:31:08 -0700
Subject: Re: HBase on docker NotServingRegionException because of hostname alisas
To: "user@hbase.apache.org"
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=94eb2c129f2a6a43ea053bc2b1fb
archived-at: Mon, 05 Sep 2016 13:31:14 -0000

--94eb2c129f2a6a43ea053bc2b1fb
Content-Type: text/plain; charset=UTF-8

clusterdock uses --net=host for running the framework out of a container,
but each Hadoop/HBase cluster itself runs with its own bridge network. Just
suggesting clusterdock since it's what we now use for testing HBase releases
and it looks a bit more sophisticated than this other project (e.g. no need
to rebuild images for different cluster sizes).

The error you're seeing is caused by not using the FQDN of the containers
when referring to them; Docker networks use the network name as the domain.

On Monday, September 5, 2016, Pierre Caserta wrote:

> That is a good script, thanks, but I would like to understand exactly what
> the problem is with my config without adding another level of abstraction
> and just running the clusterdock command.
> In your script I can see that you are using --net=host. I think this is
> the main difference compared to what I am doing, which is creating a bridge
> network for the hadoop cluster.
> I have only 3 machines: hadoop-master, hadoop-slave1, hadoop-slave2.
>
> Why do those strange hadoop-slave2.hadoopnet aliases appear in the web ui?
> It looks like the network name is used as part of the hostname.
> Any idea what is happening in my case?
>
> Pierre
>
> > On 5 Sep 2016, at 16:48, Dima Spivak wrote:
> >
> > You should try the Apache HBase topology for clusterdock that was
> > committed a few months back. See HBASE-12721 for details.
> >
> > On Sunday, September 4, 2016, Pierre Caserta wrote:
> >
> >> Hi,
> >> I am building a fully distributed hbase cluster with unmanaged zookeeper.
> >> I pretty much used this example and installed hbase on top of it:
> >> https://github.com/kiwenlau/hadoop-cluster-docker
> >>
> >> Hadoop and hdfs work fine but I get this exception with hbase:
> >>
> >> 2016-09-05 06:27:12,268 INFO [hadoop-master:16000.activeMasterManager] zookeeper.MetaTableLocator: Failed verification of hbase:meta,,1 at address=hadoop-slave2,16020,1473052276351, exception=org.apache.hadoop.hbase.NotServingRegionException: Region hbase:meta,,1 is not online on hadoop-slave2.hadoopnet,16020,1473056813966
> >>     at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2910)
> >>
> >> This is blocking because any command I enter on the hbase shell will
> >> return the following error:
> >>
> >> ERROR: org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
> >>
> >> The containers are run using --net=hadoopnet,
> >> which is a network created as such:
> >>
> >> docker network create --driver=bridge hadoopnet
> >>
> >> The hbase webui is showing this:
> >>
> >> Region Servers
> >> ServerName                                   Start time                    Version  Requests Per Second  Num. Regions
> >> hadoop-slave1,16020,1473056814064            Mon Sep 05 06:26:54 UTC 2016  1.2.2    0                    0
> >> hadoop-slave1.hadoopnet,16020,1473056814064  Mon Sep 05 06:26:54 UTC 2016  Unknown  0                    0
> >> hadoop-slave2,16020,1473056813966            Mon Sep 05 06:26:53 UTC 2016  1.2.2    0                    0
> >> hadoop-slave2.hadoopnet,16020,1473056813966  Mon Sep 05 06:26:53 UTC 2016  Unknown  0                    0
> >> Total:4                                      2 nodes with inconsistent version      0                    0
> >>
> >> I should have only 2 regionservers, but 2 strange entries,
> >> hadoop-slave1.hadoopnet and hadoop-slave2.hadoopnet, are added to the list.
> >> When I look at zk using:
> >>
> >> /usr/local/hbase/bin/hbase zkcli -server zk:2181 ls /hbase/rs
> >>
> >> I only see my 2 regionservers: hadoop-slave1,16020,1473056814064 and
> >> hadoop-slave2,16020,1473056813966
> >>
> >> Looking at the zookeeper.MetaTableLocator: Failed verification error, I
> >> see that hadoop-slave2,16020,1473052276351 and
> >> hadoop-slave2.hadoopnet,16020,1473056813966 get mixed up.
> >>
> >> Here is my config on all servers:
> >>
> >> <configuration>
> >>   <property>
> >>     <name>hbase.rootdir</name>
> >>     <value>hdfs://hadoop-master:9000/hbase</value>
> >>     <description>The directory shared by region servers. Should
> >>     be fully-qualified to include the filesystem to use. E.g:
> >>     hdfs://NAMENODE_SERVER:PORT/HBASE_ROOTDIR</description>
> >>   </property>
> >>   <property>
> >>     <name>hbase.master</name>
> >>     <value>hdfs://hadoop-master:60000</value>
> >>     <description>The host and port that the HBase master runs
> >>     at.</description>
> >>   </property>
> >>   <property>
> >>     <name>hbase.cluster.distributed</name>
> >>     <value>true</value>
> >>     <description>The mode the cluster will be in. Possible
> >>     values are
> >>     false: standalone and pseudo-distributed setups with managed
> >>     Zookeeper
> >>     true: fully-distributed with unmanaged Zookeeper Quorum (see
> >>     hbase-env.sh)</description>
> >>   </property>
> >>   <property>
> >>     <name>hbase.master.info.port</name>
> >>     <value>60010</value>
> >>     <description>The UI interface of HBase master
> >>     runs.</description>
> >>   </property>
> >>   <property>
> >>     <name>hbase.zookeeper.quorum</name>
> >>     <value>zk</value>
> >>     <description>string m_e_m_b_e_r_s is replaced by list of
> >>     hosts separated by comma. Its generated by configure-slaves.sh on master
> >>     node</description>
> >>   </property>
> >>   <property>
> >>     <name>hbase.zookeeper.property.maxClientCnxns</name>
> >>     <value>300</value>
> >>   </property>
> >>   <property>
> >>     <name>hbase.zookeeper.property.datadir</name>
> >>     <value>/tmp/zookeeper</value>
> >>     <description>location of storage of zookeeper
> >>     data</description>
> >>   </property>
> >>   <property>
> >>     <name>hbase.zookeeper.property.clientPort</name>
> >>     <value>2181</value>
> >>   </property>
> >> </configuration>
> >>
> >> I created a stack overflow question as well:
> >> http://stackoverflow.com/questions/39325041/hbase-on-docker-notservingregionexception-because-of-hostname-alisas
> >>
> >> Thanks,
> >> Pierre
> >
> > --
> > -Dima

-- 
-Dima

--94eb2c129f2a6a43ea053bc2b1fb--
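
For illustration only, here is a minimal Python sketch of the hostname mix-up discussed in this thread: the same region server appearing once under its short name and once with the Docker network name as its domain. The helper names (to_fqdn, parse_server_name, group_by_host) are hypothetical and are not part of HBase or Docker. The sketch assumes the FQDN is simply the short hostname followed by the network name, as the reply at the top of the thread describes.

```python
# Hypothetical helpers for illustration; none of these exist in HBase or Docker.

def to_fqdn(hostname, network):
    """Return hostname with the Docker network name as its domain.

    Assumption: the FQDN is '<short name>.<network name>'.
    A hostname that already carries that suffix is returned unchanged.
    """
    suffix = "." + network
    return hostname if hostname.endswith(suffix) else hostname + suffix


def parse_server_name(server_name):
    """Split a 'host,port,startcode' string as shown in the web UI listing."""
    host, port, startcode = server_name.split(",")
    return host, int(port), int(startcode)


def group_by_host(server_names, network):
    """Group server name strings that share the same FQDN, port and startcode."""
    groups = {}
    for name in server_names:
        host, port, startcode = parse_server_name(name)
        key = (to_fqdn(host, network), port, startcode)
        groups.setdefault(key, []).append(name)
    return groups


if __name__ == "__main__":
    # The four entries from the Region Servers listing quoted in the thread.
    listed = [
        "hadoop-slave1,16020,1473056814064",
        "hadoop-slave1.hadoopnet,16020,1473056814064",
        "hadoop-slave2,16020,1473056813966",
        "hadoop-slave2.hadoopnet,16020,1473056813966",
    ]
    for key, names in group_by_host(listed, "hadoopnet").items():
        print(key, names)
```

Grouped this way, the four listed entries collapse into two distinct servers, matching the two entries seen under /hbase/rs in ZooKeeper. It only illustrates why short names and FQDNs get treated as different servers; it is not a fix for the cluster configuration.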