From: Dima Spivak
Date: Mon, 5 Sep 2016 15:47:40 -0700
Reply-To: user@hbase.apache.org
References: <06089C15-4B61-46C4-A7F7-F11F866275B6@gmail.com>
Subject: Re: HBase on docker NotServingRegionException
 because of hostname alisas
To: user@hbase.apache.org

Sounds good, Pierre. FWIW, if you want a preview, here's how to get a 5-node
HBase cluster running based on the master branch of HBase in about a minute:

1. Source the clusterdock.sh script that defines the clusterdock_ helper
   functions:

   source /dev/stdin <<< "$(curl -sL http://tiny.cloudera.com/clusterdock.sh)"

2. Start up a cluster:

   CLUSTERDOCK_TOPOLOGY_IMAGE=hbasejenkinsuser-docker-hbase.bintray.io/dev/clusterdock:apache_hbase_topology \
     clusterdock_run ./bin/start_cluster \
     -r hbasejenkinsuser-docker-hbase.bintray.io --namespace dev \
     apache_hbase --hbase-version=master --hadoop-version=2.7.1 \
     --secondary-nodes='node-{2..5}'

And that's it. For help information, put a -h right after ./bin/start_cluster
(for details about the function) or right after apache_hbase (for details
about the Apache HBase topology).

-Dima

On Mon, Sep 5, 2016 at 3:44 PM, Pierre Caserta wrote:

> Thanks for your answer.
> I will check the ticket https://issues.apache.org/jira/browse/HBASE-15961
> regularly and try clusterdock as soon as the documentation comes out.
> I will try to use hostnames with a domain, like master.hadoopnet.com, and a
> network named hadoopnet.com, to see if this resolves the problem.
> Currently my hostnames are hadoop-master, hadoop-slave1 and hadoop-slave2;
> maybe that is the problem.
>
> > On 5 Sep 2016, at 23:31, Dima Spivak wrote:
> >
> > clusterdock uses --net=host for running the framework out of a container,
> > but each Hadoop/HBase cluster itself runs with its own bridge network. Just
> > suggesting clusterdock since it's what we now use for testing HBase
> > releases and it looks a bit more sophisticated than this other project (e.g.
> > no need to rebuild images for different cluster sizes).
> >
> > The error you're seeing is caused by not using the FQDN of the containers
> > when referring to them; Docker networks use the network name as the domain.
> >
> > On Monday, September 5, 2016, Pierre Caserta wrote:
> >
> >> That is a good script, thanks, but I would like to understand exactly what
> >> the problem is with my config, without adding another level of abstraction
> >> and just running the clusterdock command.
> >> In your script I can see that you are using --net=host. I think this is
> >> the main difference compared to what I am doing, which is creating a bridge
> >> network for the hadoop cluster.
> >> I have only 3 machines: hadoop-master, hadoop-slave1, hadoop-slave2.
> >>
> >> Why do those strange hadoop-slave2.hadoopnet aliases appear in the web UI?
> >> It looks like the network name is used as part of the hostname.
> >> Any idea what is happening in my case?
> >>
> >> Pierre
> >>
> >>> On 5 Sep 2016, at 16:48, Dima Spivak wrote:
> >>>
> >>> You should try the Apache HBase topology for clusterdock that was
> >>> committed a few months back. See HBASE-12721 for details.
> >>>
> >>> On Sunday, September 4, 2016, Pierre Caserta wrote:
> >>>
> >>>> Hi,
> >>>> I am building a fully distributed hbase cluster with unmanaged zookeeper.
> >>>> I pretty much used this example and installed hbase on top of it:
> >>>> https://github.com/kiwenlau/hadoop-cluster-docker
> >>>>
> >>>> Hadoop and hdfs work fine but I get this exception with hbase:
> >>>>
> >>>> 2016-09-05 06:27:12,268 INFO [hadoop-master:16000.activeMasterManager]
> >>>> zookeeper.MetaTableLocator: Failed verification of hbase:meta,,1 at
> >>>> address=hadoop-slave2,16020,1473052276351,
> >>>> exception=org.apache.hadoop.hbase.NotServingRegionException: Region
> >>>> hbase:meta,,1 is not online on hadoop-slave2.hadoopnet,16020,1473056813966
> >>>>     at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2910)
> >>>>
> >>>> This is blocking because any command I enter in the hbase shell returns
> >>>> the following error:
> >>>>
> >>>> ERROR: org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
> >>>>
> >>>> The containers are run using --net=hadoopnet, which is a network created
> >>>> as such:
> >>>>
> >>>> docker network create --driver=bridge hadoopnet
> >>>>
> >>>> The hbase web UI is showing this:
> >>>>
> >>>> Region Servers
> >>>>
> >>>> ServerName                                   Start time                    Version  Requests Per Second  Num. Regions
> >>>> hadoop-slave1,16020,1473056814064            Mon Sep 05 06:26:54 UTC 2016  1.2.2    0                    0
> >>>> hadoop-slave1.hadoopnet,16020,1473056814064  Mon Sep 05 06:26:54 UTC 2016  Unknown  0                    0
> >>>> hadoop-slave2,16020,1473056813966            Mon Sep 05 06:26:53 UTC 2016  1.2.2    0                    0
> >>>> hadoop-slave2.hadoopnet,16020,1473056813966  Mon Sep 05 06:26:53 UTC 2016  Unknown  0                    0
> >>>> Total:4                                      2 nodes with inconsistent version      0                    0
> >>>>
> >>>> I should have only 2 regionservers, but 2 strange entries,
> >>>> hadoop-slave1.hadoopnet and hadoop-slave2.hadoopnet, are added to the list.
> >>>> When I look at zk using:
> >>>>
> >>>> /usr/local/hbase/bin/hbase zkcli -server zk:2181 ls /hbase/rs
> >>>>
> >>>> I only see my 2 regionservers: hadoop-slave1,16020,1473056814064 and
> >>>> hadoop-slave2,16020,1473056813966
> >>>>
> >>>> Looking at the zookeeper.MetaTableLocator: Failed verification error, I
> >>>> see that hadoop-slave2,16020,1473052276351 and
> >>>> hadoop-slave2.hadoopnet,16020,1473056813966 get mixed up.
> >>>>
> >>>> Here is my config on all servers:
> >>>>
> >>>> <configuration>
> >>>>   <property>
> >>>>     <name>hbase.rootdir</name>
> >>>>     <value>hdfs://hadoop-master:9000/hbase</value>
> >>>>     <description>The directory shared by region servers. Should
> >>>>     be fully-qualified to include the filesystem to use. E.g:
> >>>>     hdfs://NAMENODE_SERVER:PORT/HBASE_ROOTDIR
> >>>>     </description>
> >>>>   </property>
> >>>>   <property>
> >>>>     <name>hbase.master</name>
> >>>>     <value>hdfs://hadoop-master:60000</value>
> >>>>     <description>The host and port that the HBase master runs
> >>>>     at.</description>
> >>>>   </property>
> >>>>   <property>
> >>>>     <name>hbase.cluster.distributed</name>
> >>>>     <value>true</value>
> >>>>     <description>The mode the cluster will be in. Possible
> >>>>     values are
> >>>>     false: standalone and pseudo-distributed setups with managed
> >>>>     Zookeeper
> >>>>     true: fully-distributed with unmanaged Zookeeper Quorum (see
> >>>>     hbase-env.sh)
> >>>>     </description>
> >>>>   </property>
> >>>>   <property>
> >>>>     <name>hbase.master.info.port</name>
> >>>>     <value>60010</value>
> >>>>     <description>The UI interface of HBase master
> >>>>     runs.</description>
> >>>>   </property>
> >>>>   <property>
> >>>>     <name>hbase.zookeeper.quorum</name>
> >>>>     <value>zk</value>
> >>>>     <description>string m_e_m_b_e_r_s is replaced by list of
> >>>>     hosts separated by comma. Its generated by configure-slaves.sh on
> >>>>     master node</description>
> >>>>   </property>
> >>>>   <property>
> >>>>     <name>hbase.zookeeper.property.maxClientCnxns</name>
> >>>>     <value>300</value>
> >>>>   </property>
> >>>>   <property>
> >>>>     <name>hbase.zookeeper.property.datadir</name>
> >>>>     <value>/tmp/zookeeper</value>
> >>>>     <description>location of storage of zookeeper
> >>>>     data</description>
> >>>>   </property>
> >>>>   <property>
> >>>>     <name>hbase.zookeeper.property.clientPort</name>
> >>>>     <value>2181</value>
> >>>>   </property>
> >>>> </configuration>
> >>>>
> >>>> I created a stack overflow question as well:
> >>>> http://stackoverflow.com/questions/39325041/hbase-on-docker-notservingregionexception-because-of-hostname-alisas
> >>>>
> >>>> Thanks,
> >>>> Pierre
> >>>
> >>> --
> >>> -Dima
> >
> > --
> > -Dima
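
The duplicate entries in the region server listing quoted in this thread can be spotted mechanically. The following is a minimal sketch, not part of HBase or clusterdock: it assumes the ServerName format host,port,startcode seen in the web UI output, and the helper names parse_server_name and find_hostname_aliases are hypothetical. It groups entries that share a port and startcode and reports those whose hostnames differ only by a dotted suffix such as the Docker network name.

```python
from collections import defaultdict


def parse_server_name(server_name):
    """Split a 'host,port,startcode' string into (host, port, startcode).

    Assumption: the format matches the ServerName column quoted in the thread.
    """
    host, port, startcode = server_name.rsplit(",", 2)
    return host, int(port), int(startcode)


def find_hostname_aliases(server_names):
    """Return (short_host, longer_hosts, port, startcode) for suspected aliases.

    Entries with the same port and startcode are treated as one process.
    Within such a group, any hostname that extends the shortest one with a
    dotted suffix (e.g. 'hadoop-slave2' vs 'hadoop-slave2.hadoopnet') is
    reported as a likely alias of it.
    """
    hosts_by_instance = defaultdict(set)
    for name in server_names:
        host, port, startcode = parse_server_name(name)
        hosts_by_instance[(port, startcode)].add(host)

    aliases = []
    for port, startcode in sorted(hosts_by_instance):
        ordered = sorted(hosts_by_instance[(port, startcode)], key=len)
        short = ordered[0]
        longer = [h for h in ordered[1:] if h.startswith(short + ".")]
        if longer:
            aliases.append((short, longer, port, startcode))
    return aliases


if __name__ == "__main__":
    # ServerName values copied from the web UI listing quoted in the thread.
    listed = [
        "hadoop-slave1,16020,1473056814064",
        "hadoop-slave1.hadoopnet,16020,1473056814064",
        "hadoop-slave2,16020,1473056813966",
        "hadoop-slave2.hadoopnet,16020,1473056813966",
    ]
    for short, longer, port, startcode in find_hostname_aliases(listed):
        print(f"{short} also listed as {', '.join(longer)} "
              f"(port {port}, startcode {startcode})")
```

A non-empty result suggests the same region server process is being reported under both its short name and a name qualified with the network name, which is the mismatch described above. The sketch only inspects the listing; it does not resolve names or change any HBase or Docker setting.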