hbase-user mailing list archives

From Dima Spivak <dimaspi...@apache.org>
Subject Re: HBase on docker NotServingRegionException because of hostname alisas
Date Tue, 06 Sep 2016 04:51:41 GMT
Hey Pierre,

Sorry, I just don't think it's worth the time trying to debug this
framework when a more robust one exists. Perhaps try reaching out to
"kiwenlau?"

-Dima

On Mon, Sep 5, 2016 at 9:49 PM, Pierre Caserta <pierre.caserta@gmail.com>
wrote:

> Thanks Dima,
> Now even if I use a network called hadoopnet.com
> I still have the same problem.
> Here are my regionservers that get detected:
>
> Region Servers
> ServerName      Start time      Version Requests Per Second     Num. Regions
> hadoop-slave1.hadoopnet.com,16020,1473137128613                 Tue Sep 06 04:45:28 UTC 2016    1.2.2   0       0
> hadoop-slave1.hadoopnet.com.hadoopnet.com,16020,1473137128613   Tue Sep 06 04:45:28 UTC 2016    Unknown 0       0
> hadoop-slave2.hadoopnet.com,16020,1473137127975                 Tue Sep 06 04:45:27 UTC 2016    1.2.2   0       0
> hadoop-slave2.hadoopnet.com.hadoopnet.com,16020,1473137127975   Tue Sep 06 04:45:27 UTC 2016    Unknown 0       0
> Total:4         2 nodes with inconsistent version       0       0
>
> instead of just hadoop-slave1.hadoopnet.com,16020,1473137128613 and
> hadoop-slave2.hadoopnet.com,16020,1473137127975
> This is the script I used to start the hadoop cluster
>
> ---
> #!/bin/bash
>
> # the default node number is 3
> N=${1:-3}
>
>
> NETWORK=hadoopnet.com
> docker rm -f zk.$NETWORK &> /dev/null
> echo "start zk container..."
> docker run -p 2181:2181 --name zk.$NETWORK --hostname zk.$NETWORK \
>         --net=$NETWORK -itd -v conf:/opt/zookeeper/conf -v data:/tmp/zookeeper \
>         jplock/zookeeper
>
> # start hadoop master container
> docker rm -f hadoop-master.$NETWORK &> /dev/null
> echo "start hadoop-master container..."
> docker run -itd \
>                 --net=$NETWORK \
>                 -P \
>                 --name hadoop-master.$NETWORK \
>                 --hostname hadoop-master.$NETWORK \
>                 --add-host zk.$NETWORK:$(docker inspect -f "{{with index .NetworkSettings.Networks \"${NETWORK}\"}}{{.IPAddress}}{{end}}" zk.$NETWORK) \
>                 casertap/hhb
>
>
> # start hadoop slave container
> i=1
> while [ $i -lt $N ]
> do
>         docker rm -f hadoop-slave$i.$NETWORK &> /dev/null
>         echo "start hadoop-slave$i container..."
>         docker run -itd \
>                         --net=$NETWORK \
>                         --name hadoop-slave$i.$NETWORK \
>                         --hostname hadoop-slave$i.$NETWORK \
>                   --publish-all=false \
>                         --add-host hadoop-master.$NETWORK:$(docker inspect -f "{{with index .NetworkSettings.Networks \"${NETWORK}\"}}{{.IPAddress}}{{end}}" hadoop-master.$NETWORK) \
>                         --add-host zk.$NETWORK:$(docker inspect -f "{{with index .NetworkSettings.Networks \"${NETWORK}\"}}{{.IPAddress}}{{end}}" zk.$NETWORK) \
>                         casertap/hhb
>         i=$(( $i + 1 ))
> done
>
> # get into hadoop master container
> docker exec -it hadoop-master.$NETWORK bash
> ---
>
> Thanks,
> pierre
>
> > On 6 Sep 2016, at 08:47, Dima Spivak <dimaspivak@apache.org> wrote:
> >
> > Sounds good, Pierre. FWIW, if you want a preview, here's how to get a
> > 5-node HBase cluster running based on the master branch of HBase in
> > about a minute:
> >
> > 1. Source the clusterdock.sh script that defines the clusterdock_ helper
> > functions:
> >    source /dev/stdin <<< "$(curl -sL http://tiny.cloudera.com/clusterdock.sh)"
> > 2. Start up a cluster:
> >    CLUSTERDOCK_TOPOLOGY_IMAGE=hbasejenkinsuser-docker-hbase.bintray.io/dev/clusterdock:apache_hbase_topology clusterdock_run ./bin/start_cluster -r hbasejenkinsuser-docker-hbase.bintray.io --namespace dev apache_hbase --hbase-version=master --hadoop-version=2.7.1 --secondary-nodes='node-{2..5}'
> >
> > And that's it. Feel free to put a -h for help information (put it right
> > after the ./bin/start_cluster for details about the function or after the
> > apache_hbase for details about the Apache HBase topology).
> >
> > -Dima
> >
> > On Mon, Sep 5, 2016 at 3:44 PM, Pierre Caserta <pierre.caserta@gmail.com>
> > wrote:
> >
> >> Thanks for your answer.
> >> I will check the ticket https://issues.apache.org/jira/browse/HBASE-15961
> >> regularly and try clusterdock as soon as the documentation comes out.
> >> I will try to use hostnames with a domain, like master.hadoopnet.com, and a
> >> network named hadoopnet.com to see if this resolves the problem.
> >> Currently my hostnames are hadoop-master, hadoop-slave1 and hadoop-slave2,
> >> maybe that is the problem.
> >>
> >>> On 5 Sep 2016, at 23:31, Dima Spivak <dimaspivak@apache.org> wrote:
> >>>
> >>> clusterdock uses --net=host for running the framework out of a container,
> >>> but each Hadoop/HBase cluster itself runs with its own bridge network. Just
> >>> suggesting clusterdock since it's what we now use for testing HBase
> >>> releases and it looks a bit more sophisticated than this other project
> >>> (e.g. no need to rebuild images for different cluster sizes).
> >>>
> >>> The error you're seeing is caused by not using the FQDN of the containers
> >>> when referring to them; Docker networks use the network name as the domain.
> >>>
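To make the naming rule described above concrete, here is a minimal shell sketch. It assumes, as stated in the reply, that a container on a user-defined Docker network is known as `<container name>.<network name>`. The function name `docker_dns_name` is hypothetical and the sketch does not call Docker itself; it only composes strings.

```shell
#!/bin/sh
# Hypothetical helper (not part of Docker or HBase): builds the name a
# container is assumed to be known by on a user-defined network,
# i.e. "<container name>.<network name>".
# $1 = container name, $2 = network name
docker_dns_name() {
    printf '%s.%s\n' "$1" "$2"
}

# Short container name on the "hadoopnet" network.
docker_dns_name hadoop-slave2 hadoopnet

# Container name that already ends in the network name:
# the domain part ends up doubled.
docker_dns_name hadoop-slave2.hadoopnet.com hadoopnet.com
```

Under that assumption, the two results have the same shape as the extra region server names in the web UI listings quoted in this thread.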
> >>> On Monday, September 5, 2016, Pierre Caserta <pierre.caserta@gmail.com>
> >>> wrote:
> >>>
> >>>> That is a good script, thanks, but I would like to understand exactly what
> >>>> the problem is with my config without adding another level of abstraction
> >>>> and just running the clusterdock command.
> >>>> In your script I can see that you are using --net=host. I think this is
> >>>> the main difference compared to what I am doing, which is creating a bridge
> >>>> network for the hadoop cluster.
> >>>> I have only 3 machines: hadoop-master, hadoop-slave1, hadoop-slave2.
> >>>>
> >>>> Why do those strange hadoop-slave2.hadoopnet aliases appear in the web UI?
> >>>> It looks like the network name is used as part of the hostname.
> >>>> Any idea what is happening in my case?
> >>>>
> >>>> Pierre
> >>>>
> >>>>> On 5 Sep 2016, at 16:48, Dima Spivak <dimaspivak@apache.org> wrote:
> >>>>>
> >>>>> You should try the Apache HBase topology for clusterdock that was committed
> >>>>> a few months back. See HBASE-12721 for details.
> >>>>>
> >>>>> On Sunday, September 4, 2016, Pierre Caserta <pierre.caserta@gmail.com>
> >>>>> wrote:
> >>>>>
> >>>>>> Hi,
> >>>>>> I am building a fully distributed hbase cluster with unmanaged zookeeper.
> >>>>>> I pretty much used this example and installed hbase on top of it:
> >>>>>> https://github.com/kiwenlau/hadoop-cluster-docker
> >>>>>>
> >>>>>> Hadoop and hdfs work fine but I get this exception with hbase:
> >>>>>>
> >>>>>>  2016-09-05 06:27:12,268 INFO  [hadoop-master:16000.activeMasterManager] zookeeper.MetaTableLocator: Failed verification of hbase:meta,,1 at address=hadoop-slave2,16020,1473052276351, exception=org.apache.hadoop.hbase.NotServingRegionException: Region hbase:meta,,1 is not online on hadoop-slave2.hadoopnet,16020,1473056813966
> >>>>>>      at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2910)
> >>>>>>
> >>>>>> This is blocking because any command I enter on the hbase shell will return
> >>>>>> the following error:
> >>>>>>
> >>>>>>  ERROR: org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
> >>>>>>
> >>>>>> The containers are run using --net=hadoopnet,
> >>>>>> which is a network created as such:
> >>>>>>
> >>>>>>  docker network create --driver=bridge hadoopnet
> >>>>>>
> >>>>>> The hbase webui is showing this:
> >>>>>>
> >>>>>>  Region Servers
> >>>>>>  ServerName                                    Start time                      Version Requests Per Second     Num. Regions
> >>>>>>  hadoop-slave1,16020,1473056814064             Mon Sep 05 06:26:54 UTC 2016    1.2.2   0       0
> >>>>>>  hadoop-slave1.hadoopnet,16020,1473056814064   Mon Sep 05 06:26:54 UTC 2016    Unknown 0       0
> >>>>>>  hadoop-slave2,16020,1473056813966             Mon Sep 05 06:26:53 UTC 2016    1.2.2   0       0
> >>>>>>  hadoop-slave2.hadoopnet,16020,1473056813966   Mon Sep 05 06:26:53 UTC 2016    Unknown 0       0
> >>>>>>  Total:4             2 nodes with inconsistent version       0       0
> >>>>>>
> >>>>>> I should have only 2 regionservers but 2 strange hadoop-slave1.hadoopnet
> >>>>>> and hadoop-slave2.hadoopnet are added to the list.
> >>>>>> When I look at zk using:
> >>>>>>
> >>>>>>  /usr/local/hbase/bin/hbase zkcli -server zk:2181 ls /hbase/rs
> >>>>>>
> >>>>>> I only see my 2 regionservers: hadoop-slave1,16020,1473056814064 and
> >>>>>> hadoop-slave2,16020,1473056813966
> >>>>>>
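The comparison described above (names registered in ZooKeeper versus names listed in the web UI) can be sketched in shell. This is only an illustration: the function name `extra_servers` is hypothetical, and the two lists are copied by hand from the output quoted in this message rather than fetched from a live cluster.

```shell
#!/bin/sh
# Hypothetical helper: prints every line of the second list that has no
# exact match in the first list. Each argument is a newline-separated
# list of ServerName strings.
# grep -F: fixed strings, -x: whole-line match, -v: invert the match.
extra_servers() {
    printf '%s\n' "$2" | grep -Fxv -e "$1"
}

# Names seen under /hbase/rs in ZooKeeper (from the message above).
zk_servers='hadoop-slave1,16020,1473056814064
hadoop-slave2,16020,1473056813966'

# Names listed in the web UI (from the message above).
ui_servers='hadoop-slave1,16020,1473056814064
hadoop-slave1.hadoopnet,16020,1473056814064
hadoop-slave2,16020,1473056813966
hadoop-slave2.hadoopnet,16020,1473056813966'

extra_servers "$zk_servers" "$ui_servers"
```

With these inputs the function prints the two `.hadoopnet` entries, which are the ones that exist only in the web UI.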
> >>>>>> Looking at the zookeeper.MetaTableLocator: Failed verification error I see
> >>>>>> that hadoop-slave2,16020,1473052276351 and
> >>>>>> hadoop-slave2.hadoopnet,16020,1473056813966 get mixed up.
> >>>>>>
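An HBase ServerName string has the form `host,port,startcode`. As a reading aid, the following shell sketch splits the two names from the log line into their fields so they can be compared side by side. The function name `server_name_fields` is hypothetical, and the sketch assumes the host part contains no comma.

```shell
#!/bin/sh
# Hypothetical helper: turns "host,port,startcode" into
# "host port startcode" by replacing the commas with spaces.
server_name_fields() {
    printf '%s\n' "$1" | tr ',' ' '
}

# The address recorded for hbase:meta in the log line.
server_name_fields 'hadoop-slave2,16020,1473052276351'

# The server that answered the verification request.
server_name_fields 'hadoop-slave2.hadoopnet,16020,1473056813966'
```

Printed this way, the two names differ in the host field and in the start code field, while the port is the same.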
> >>>>>> Here is my config on all servers:
> >>>>>>
> >>>>>>      <?xml version="1.0" encoding="UTF-8"?>
> >>>>>>      <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
> >>>>>>
> >>>>>>      <configuration>
> >>>>>>        <property>
> >>>>>>              <name>hbase.rootdir</name>
> >>>>>>          <value>hdfs://hadoop-master:9000/hbase</value>
> >>>>>>            <description>The directory shared by region servers. Should
> >>>>>>            be fully-qualified to include the filesystem to use. E.g:
> >>>>>>            hdfs://NAMENODE_SERVER:PORT/HBASE_ROOTDIR</description>
> >>>>>>        </property>
> >>>>>>        <property>
> >>>>>>            <name>hbase.master</name>
> >>>>>>            <value>hdfs://hadoop-master:60000</value>
> >>>>>>            <description>The host and port that the HBase master runs at.</description>
> >>>>>>        </property>
> >>>>>>        <property>
> >>>>>>            <name>hbase.cluster.distributed</name>
> >>>>>>            <value>true</value>
> >>>>>>            <description>The mode the cluster will be in. Possible values are
> >>>>>>            false: standalone and pseudo-distributed setups with managed Zookeeper
> >>>>>>            true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)</description>
> >>>>>>        </property>
> >>>>>>        <property>
> >>>>>>            <name>hbase.master.info.port</name>
> >>>>>>            <value>60010</value>
> >>>>>>            <description>The UI interface of HBase master runs.</description>
> >>>>>>        </property>
> >>>>>>        <property>
> >>>>>>            <name>hbase.zookeeper.quorum</name>
> >>>>>>            <value>zk</value>
> >>>>>>            <description>string m_e_m_b_e_r_s is replaced by list of
> >>>>>>            hosts separated by comma. Its generated by configure-slaves.sh on master
> >>>>>>            node</description>
> >>>>>>        </property>
> >>>>>>        <property>
> >>>>>>            <name>hbase.zookeeper.property.maxClientCnxns</name>
> >>>>>>            <value>300</value>
> >>>>>>        </property>
> >>>>>>        <property>
> >>>>>>            <name>hbase.zookeeper.property.datadir</name>
> >>>>>>            <value>/tmp/zookeeper</value>
> >>>>>>            <description>location of storage of zookeeper data</description>
> >>>>>>        </property>
> >>>>>>        <property>
> >>>>>>            <name>hbase.zookeeper.property.clientPort</name>
> >>>>>>            <value>2181</value>
> >>>>>>        </property>
> >>>>>>
> >>>>>>      </configuration>
> >>>>>>
> >>>>>> I created a stack overflow question as well:
> >>>>>> http://stackoverflow.com/questions/39325041/hbase-on-docker-notservingregionexception-because-of-hostname-alisas
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Pierre
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> -Dima
> >>>>
> >>>>
> >>>
> >>> --
> >>> -Dima
>
>
