hbase-user mailing list archives

From Pierre Caserta <pierre.case...@gmail.com>
Subject Re: HBase on docker NotServingRegionException because of hostname alisas
Date Tue, 06 Sep 2016 04:49:10 GMT
Thanks Dima,
Even with a network named hadoopnet.com, I still have the same problem.
Here are the region servers that get detected:

Region Servers

ServerName                                                     Start time                    Version  Requests Per Second  Num. Regions
hadoop-slave1.hadoopnet.com,16020,1473137128613                Tue Sep 06 04:45:28 UTC 2016  1.2.2    0                    0
hadoop-slave1.hadoopnet.com.hadoopnet.com,16020,1473137128613  Tue Sep 06 04:45:28 UTC 2016  Unknown  0                    0
hadoop-slave2.hadoopnet.com,16020,1473137127975                Tue Sep 06 04:45:27 UTC 2016  1.2.2    0                    0
hadoop-slave2.hadoopnet.com.hadoopnet.com,16020,1473137127975  Tue Sep 06 04:45:27 UTC 2016  Unknown  0                    0
Total: 4                                                       2 nodes with inconsistent version      0                    0

instead of just hadoop-slave1.hadoopnet.com,16020,1473137128613
and hadoop-slave2.hadoopnet.com,16020,1473137127975.
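
One way to check that the extra entries are the same processes seen under a second name is to group the ServerName values by port and start code. The following is only a minimal sketch: the function name find_aliases is made up for illustration, and the names are pasted in by hand rather than read from the master.

```bash
#!/bin/bash
# Group HBase ServerName strings (host,port,startcode) by port and start code.
# Two different hosts sharing the same port and start code point to one
# process that is known under two hostnames.
find_aliases() {
    printf '%s\n' "$@" | awk -F, '
        {
            key = $2 "," $3
            # First sighting of a key stores the host; later ones append to it.
            if (n[key]++) hosts[key] = hosts[key] " " $1
            else hosts[key] = $1
        }
        END { for (k in n) if (n[k] > 1) print k ": " hosts[k] }
    ' | sort
}

find_aliases \
    hadoop-slave1.hadoopnet.com,16020,1473137128613 \
    hadoop-slave1.hadoopnet.com.hadoopnet.com,16020,1473137128613 \
    hadoop-slave2.hadoopnet.com,16020,1473137127975 \
    hadoop-slave2.hadoopnet.com.hadoopnet.com,16020,1473137127975
```

Each line it prints is a port and start code followed by every hostname reported for it, so a line with two hostnames is one region server listed twice.
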
This is the script I used to start the Hadoop cluster:

---
#!/bin/bash

# the default node number is 3
N=${1:-3}


NETWORK=hadoopnet.com
docker rm -f zk.$NETWORK &> /dev/null
echo "start zk container..."
docker run -p 2181:2181 --name zk.$NETWORK --hostname zk.$NETWORK --net=$NETWORK -itd \
    -v conf:/opt/zookeeper/conf -v data:/tmp/zookeeper jplock/zookeeper

# start hadoop master container
docker rm -f hadoop-master.$NETWORK &> /dev/null
echo "start hadoop-master container..."
docker run -itd \
                --net=$NETWORK \
                -P \
                --name hadoop-master.$NETWORK \
                --hostname hadoop-master.$NETWORK \
                --add-host zk.$NETWORK:$(docker inspect -f "{{with index .NetworkSettings.Networks \"${NETWORK}\"}}{{.IPAddress}}{{end}}" zk.$NETWORK) \
                casertap/hhb


# start hadoop slave container
i=1
while [ $i -lt $N ]
do
	docker rm -f hadoop-slave$i.$NETWORK &> /dev/null
	echo "start hadoop-slave$i container..."
	docker run -itd \
	                --net=$NETWORK \
	                --name hadoop-slave$i.$NETWORK \
	                --hostname hadoop-slave$i.$NETWORK \
	                --publish-all=false \
	                --add-host hadoop-master.$NETWORK:$(docker inspect -f "{{with index .NetworkSettings.Networks \"${NETWORK}\"}}{{.IPAddress}}{{end}}" hadoop-master.$NETWORK) \
	                --add-host zk.$NETWORK:$(docker inspect -f "{{with index .NetworkSettings.Networks \"${NETWORK}\"}}{{.IPAddress}}{{end}}" zk.$NETWORK) \
	                casertap/hhb
	i=$(( $i + 1 ))
done

# get into hadoop master container
docker exec -it hadoop-master.$NETWORK bash
---
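
As Dima noted, Docker networks use the network name as the domain, which would explain the doubled suffix in the region server list. Below is a minimal sketch of that naming. It assumes, without having verified it, that the second name is formed as <container name>.<network name>. The helper docker_dns_name is hypothetical and only builds the string; it does not call Docker.

```bash
#!/bin/bash
# Hypothetical helper: the name a container is ASSUMED to get from Docker's
# network DNS, i.e. <container name>.<network name>.
docker_dns_name() {
    local container_name=$1 network=$2
    printf '%s.%s\n' "$container_name" "$network"
}

NETWORK=hadoopnet.com

# Script above: --name already carries the domain, so the domain appears twice.
docker_dns_name "hadoop-slave1.$NETWORK" "$NETWORK"

# Possible alternative (an untested assumption): a short --name combined with
# a full --hostname, so that both names agree.
docker_dns_name "hadoop-slave1" "$NETWORK"
```

If that assumption holds, starting each container with --name hadoop-slave$i and --hostname hadoop-slave$i.$NETWORK might make the two names line up, but I have not tested this.
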

Thanks,
pierre

> On 6 Sep 2016, at 08:47, Dima Spivak <dimaspivak@apache.org> wrote:
> 
> Sounds good, Pierre. FWIW, if you want a preview, here's how to get a
> 5-node HBase cluster running based on the master branch of HBase in about a
> minute:
> 
> 1. Source the clusterdock.sh script that defines the clusterdock_ helper
> functions:
>
>     source /dev/stdin <<< "$(curl -sL http://tiny.cloudera.com/clusterdock.sh)"
>
> 2. Start up a cluster:
>
>     CLUSTERDOCK_TOPOLOGY_IMAGE=hbasejenkinsuser-docker-hbase.bintray.io/dev/clusterdock:apache_hbase_topology \
>         clusterdock_run ./bin/start_cluster -r hbasejenkinsuser-docker-hbase.bintray.io \
>         --namespace dev apache_hbase --hbase-version=master --hadoop-version=2.7.1 \
>         --secondary-nodes='node-{2..5}'
> 
> And that's it. Feel free to add -h for help information (put it right
> after ./bin/start_cluster for details about the function, or after
> apache_hbase for details about the Apache HBase topology).
> 
> -Dima
> 
> On Mon, Sep 5, 2016 at 3:44 PM, Pierre Caserta <pierre.caserta@gmail.com>
> wrote:
> 
>> Thanks for your answer.
>> I will check the ticket https://issues.apache.org/jira/browse/HBASE-15961
>> regularly and try clusterdock as soon as the documentation comes out.
>> I will try to use hostnames with a domain, like master.hadoopnet.com, and a
>> network named hadoopnet.com, to see if this resolves the problem.
>> Currently my hostnames are hadoop-master, hadoop-slave1 and hadoop-slave2;
>> maybe that is the problem.
>> 
>>> On 5 Sep 2016, at 23:31, Dima Spivak <dimaspivak@apache.org> wrote:
>>> 
>>> clusterdock uses --net=host for running the framework out of a container,
>>> but each Hadoop/HBase cluster itself runs with its own bridge network. Just
>>> suggesting clusterdock since it's what we now use for testing HBase
>>> releases and it looks a bit more sophisticated than this other project
>>> (e.g. no need to rebuild images for different cluster sizes).
>>>
>>> The error you're seeing is caused by not using the FQDN of the containers
>>> when referring to them; Docker networks use the network name as the domain.
>>> 
>>> On Monday, September 5, 2016, Pierre Caserta <pierre.caserta@gmail.com>
>>> wrote:
>>> 
>>>> That is a good script, thanks, but I would like to understand exactly what
>>>> the problem with my config is, without adding another level of abstraction
>>>> by just running the clusterdock command.
>>>> In your script I can see that you are using --net=host. I think this is
>>>> the main difference from what I am doing, which is creating a bridge
>>>> network for the Hadoop cluster.
>>>> I have only 3 machines: hadoop-master, hadoop-slave1, hadoop-slave2.
>>>>
>>>> Why do those strange hadoop-slave2.hadoopnet aliases appear in the web UI?
>>>> It looks like the network name is used as part of the hostname.
>>>> Any idea what is happening in my case?
>>>> 
>>>> Pierre
>>>> 
>>>>> On 5 Sep 2016, at 16:48, Dima Spivak <dimaspivak@apache.org> wrote:
>>>>> 
>>>>> You should try the Apache HBase topology for clusterdock that was committed
>>>>> a few months back. See HBASE-12721 for details.
>>>>> 
>>>>> On Sunday, September 4, 2016, Pierre Caserta <pierre.caserta@gmail.com>
>>>>> wrote:
>>>>> 
>>>>>> Hi,
>>>>>> I am building a fully distributed HBase cluster with unmanaged ZooKeeper.
>>>>>> I pretty much used this example and installed HBase on top of it:
>>>>>> https://github.com/kiwenlau/hadoop-cluster-docker
>>>>>>
>>>>>> Hadoop and HDFS work fine, but I get this exception with HBase:
>>>>>>
>>>>>>  2016-09-05 06:27:12,268 INFO  [hadoop-master:16000.activeMasterManager] zookeeper.MetaTableLocator: Failed verification of hbase:meta,,1 at address=hadoop-slave2,16020,1473052276351, exception=org.apache.hadoop.hbase.NotServingRegionException: Region hbase:meta,,1 is not online on hadoop-slave2.hadoopnet,16020,1473056813966
>>>>>>      at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2910)
>>>>>>
>>>>>> This is blocking, because any command I enter in the HBase shell returns
>>>>>> the following error:
>>>>>>
>>>>>>  ERROR: org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
>>>>>>
>>>>>> The containers are run with --net=hadoopnet,
>>>>>> which is a network created as follows:
>>>>>>
>>>>>>  docker network create --driver=bridge hadoopnet
>>>>>> 
>>>>>> The hbase webui is showing this:
>>>>>> 
>>>>>>  Region Servers
>>>>>>  ServerName                                   Start time                    Version  Requests Per Second  Num. Regions
>>>>>>  hadoop-slave1,16020,1473056814064            Mon Sep 05 06:26:54 UTC 2016  1.2.2    0                    0
>>>>>>  hadoop-slave1.hadoopnet,16020,1473056814064  Mon Sep 05 06:26:54 UTC 2016  Unknown  0                    0
>>>>>>  hadoop-slave2,16020,1473056813966            Mon Sep 05 06:26:53 UTC 2016  1.2.2    0                    0
>>>>>>  hadoop-slave2.hadoopnet,16020,1473056813966  Mon Sep 05 06:26:53 UTC 2016  Unknown  0                    0
>>>>>>  Total: 4                                     2 nodes with inconsistent version      0                    0
>>>>>> 
>>>>>> I should have only 2 region servers, but 2 strange entries,
>>>>>> hadoop-slave1.hadoopnet and hadoop-slave2.hadoopnet, are added to the list.
>>>>>> When I look at zk using:
>>>>>>
>>>>>>  /usr/local/hbase/bin/hbase zkcli -server zk:2181 ls /hbase/rs
>>>>>>
>>>>>> I only see my 2 region servers: hadoop-slave1,16020,1473056814064 and
>>>>>> hadoop-slave2,16020,1473056813966
>>>>>>
>>>>>> Looking at the zookeeper.MetaTableLocator "Failed verification" error, I see
>>>>>> that hadoop-slave2,16020,1473052276351 and
>>>>>> hadoop-slave2.hadoopnet,16020,1473056813966 get mixed up.
>>>>>>
>>>>>> Here is my config on all servers:
>>>>>>
>>>>>>      <?xml version="1.0" encoding="UTF-8"?>
>>>>>>      <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>>>>>>
>>>>>>      <configuration>
>>>>>>        <property>
>>>>>>            <name>hbase.rootdir</name>
>>>>>>            <value>hdfs://hadoop-master:9000/hbase</value>
>>>>>>            <description>The directory shared by region servers. Should be fully-qualified to include the filesystem to use. E.g: hdfs://NAMENODE_SERVER:PORT/HBASE_ROOTDIR</description>
>>>>>>        </property>
>>>>>>        <property>
>>>>>>            <name>hbase.master</name>
>>>>>>            <value>hdfs://hadoop-master:60000</value>
>>>>>>            <description>The host and port that the HBase master runs at.</description>
>>>>>>        </property>
>>>>>>        <property>
>>>>>>            <name>hbase.cluster.distributed</name>
>>>>>>            <value>true</value>
>>>>>>            <description>The mode the cluster will be in. Possible values are
>>>>>>            false: standalone and pseudo-distributed setups with managed Zookeeper
>>>>>>            true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)</description>
>>>>>>        </property>
>>>>>>        <property>
>>>>>>            <name>hbase.master.info.port</name>
>>>>>>            <value>60010</value>
>>>>>>            <description>The UI interface of HBase master runs.</description>
>>>>>>        </property>
>>>>>>        <property>
>>>>>>            <name>hbase.zookeeper.quorum</name>
>>>>>>            <value>zk</value>
>>>>>>            <description>string m_e_m_b_e_r_s is replaced by list of hosts separated by comma. Its generated by configure-slaves.sh on master node</description>
>>>>>>        </property>
>>>>>>        <property>
>>>>>>            <name>hbase.zookeeper.property.maxClientCnxns</name>
>>>>>>            <value>300</value>
>>>>>>        </property>
>>>>>>        <property>
>>>>>>            <name>hbase.zookeeper.property.datadir</name>
>>>>>>            <value>/tmp/zookeeper</value>
>>>>>>            <description>location of storage of zookeeper data</description>
>>>>>>        </property>
>>>>>>        <property>
>>>>>>            <name>hbase.zookeeper.property.clientPort</name>
>>>>>>            <value>2181</value>
>>>>>>        </property>
>>>>>>
>>>>>>      </configuration>
>>>>>> 
>>>>>> I created a Stack Overflow question as well:
>>>>>> http://stackoverflow.com/questions/39325041/hbase-on-docker-notservingregionexception-because-of-hostname-alisas
>>>>>> 
>>>>>> Thanks,
>>>>>> Pierre
>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> -Dima
>>>> 
>>>> 
>>> 
>>> --
>>> -Dima

