hbase-user mailing list archives

From Sean Bigdatafun <sean.bigdata...@gmail.com>
Subject Re: 0.90.1 HMaster malfunction in pseudo-distributed mode
Date Wed, 01 Jun 2011 06:45:19 GMT
Sure. Thanks, St.Ack. Attached are the HBase logs, plus a screenshot
of the region server. I think /etc/hosts should be OK, because my Hadoop
(pseudo-distributed) cluster runs fine. But I am posting it here in
case I missed something :-0

127.0.0.1    localhost
127.0.1.1    sean-PowerEdge

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback localhost6
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
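
To look at the question of whether the names in this file are "equated", here is a minimal Python sketch. It only parses hosts-file text (a copy of the first lines above) and does not query the real resolver. The names `hosts_map` and `same_address` are made up for illustration.

```python
# Illustrative sketch only: parses hosts-file text, does not query the resolver.
HOSTS_TEXT = """\
127.0.0.1    localhost
127.0.1.1    sean-PowerEdge

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback localhost6
"""


def hosts_map(text):
    """Return {hostname: address} from hosts-file text (first entry wins)."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            mapping.setdefault(name, address)
    return mapping


def same_address(text, name_a, name_b):
    """True only if both names are listed and map to the same address."""
    mapping = hosts_map(text)
    return name_a in mapping and mapping.get(name_a) == mapping.get(name_b)


if __name__ == "__main__":
    m = hosts_map(HOSTS_TEXT)
    print(m["localhost"], m["sean-PowerEdge"])
    print(same_address(HOSTS_TEXT, "localhost", "sean-PowerEdge"))
```

With the file above, `localhost` maps to 127.0.0.1 while `sean-PowerEdge` maps to 127.0.1.1, so the two names point at different loopback addresses. Whether that is what causes the "Connection refused" on 127.0.0.1:60020 in the master log is not confirmed here.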

Thanks,
Sean





On Mon, May 30, 2011 at 7:34 PM, Stack <stack@duboce.net> wrote:

> Odd.  I don't see the regionserver checking into the master (maybe
> that's the way it is in pseudo-distributed and I just forgot).  Can you
> paste more master log?   I don't see the regionserver coming in in the
> snippet you've pasted so not sure how it's registering itself (I see
> the timeout when we try to assign it -ROOT-).
>
> What's in your /etc/hosts?  I see lots of localhost and 127.0.0.1.
> Maybe the two are not equated in your resolver setup?
>
> St.Ack
>
> On Sat, May 28, 2011 at 11:28 PM, Sean Bigdatafun
> <sean.bigdatafun@gmail.com> wrote:
> > I am trying out 0.90.1 (hbase-0.90.1-CDH3B4) in pseudo-distributed mode,
> > and ran into the problem of HMaster crashing. Here is what I did.
> >
> > I. First I installed a Hadoop pseudo-distributed cluster
> > (hadoop-0.20.2-CDH3B4) with the following conf edited.
> >
> > 1) core-site.xml ==>
> > <property>
> >  <name>fs.default.name</name>
> >  <value>hdfs://localhost:9000</value>
> > </property>
> >
> > 2) hdfs-site.xml ==>
> >  <property>
> >    <name>dfs.replication</name>
> >    <value>1</value>
> >  </property>
> >
> > (with above confs, start-all.sh was run, and the hadoop pseudo cluster
> > started to run happily)
> >
> >
> > Secondly, I installed hbase-0.90.1-CDH3B4 with the following conf edited.
> >
> > hbase-site.xml ==>
> >  <property>
> >    <name>hbase.rootdir</name>
> >    <value>hdfs://localhost:9000/hbase</value>
> >  </property>
> >
> >  <property>
> >    <name>hbase.cluster.distributed</name>
> >    <value>true</value>
> >  </property>
> >
> >  <property>
> >    <name>hbase.zookeeper.quorum</name>
> >    <value>localhost</value>
> >  </property>
> >
> >  <property>
> >    <name>dfs.replication</name>
> >    <value>1</value>
> >    <description>The replication count for HLog and HFile storage. Should
> > not be greater than HDFS datanode count.
> >    </description>
> >  </property>
> >
> > (with the above conf, I ran start-hbase.sh, and I realised
> > that HMaster did not function well -- I can't access localhost:60010)
> >
> >
> > II. Here is the HMaster error log:
> >
> > 2011-05-28 23:22:55,292 WARN
> > org.apache.hadoop.hbase.master.AssignmentManager: Unable to find a viable
> > location to assign region -ROOT-,,0.70236052
> > 2011-05-28 23:23:35,291 INFO
> > org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition
> > timed out:  -ROOT-,,0.70236052 state=OFFLINE, ts=1306650175292
> > 2011-05-28 23:23:35,291 INFO
> > org.apache.hadoop.hbase.master.AssignmentManager: Region has been OFFLINE
> > for too long, reassigning -ROOT-,,0.70236052 to a random server
> > 2011-05-28 23:23:35,291 DEBUG
> > org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE;
> > was=-ROOT-,,0.70236052 state=OFFLINE, ts=1306650175292
> > 2011-05-28 23:23:35,291 DEBUG
> > org.apache.hadoop.hbase.master.AssignmentManager: Using pre-existing plan
> > for region -ROOT-,,0.70236052; plan=hri=-ROOT-,,0.70236052, src=,
> > dest=localhost,60020,1306648534687
> > 2011-05-28 23:23:35,291 DEBUG
> > org.apache.hadoop.hbase.master.AssignmentManager: Assigning region
> > -ROOT-,,0.70236052 to localhost,60020,1306648534687
> > 2011-05-28 23:23:35,291 DEBUG
> > org.apache.hadoop.hbase.master.ServerManager:
> > New connection to localhost,60020,1306648534687
> > 2011-05-28 23:23:35,292 INFO org.apache.hadoop.ipc.HbaseRPC: Server at /
> > 127.0.0.1:60020 could not be reached after 1 tries, giving up.
> > 2011-05-28 23:23:35,292 WARN
> > org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of
> > -ROOT-,,0.70236052 to serverName=localhost,60020,1306648534687,
> > load=(requests=0, regions=0, usedHeap=22, maxHeap=996), trying to assign
> > elsewhere instead; retry=0
> > org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed setting up proxy interface org.apache.hadoop.hbase.ipc.HRegionInterface to /127.0.0.1:60020 after attempts=1
> >        at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:355)
> >        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:954)
> >        at org.apache.hadoop.hbase.master.ServerManager.getServerConnection(ServerManager.java:606)
> >        at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:541)
> >        at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:901)
> >        at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:730)
> >        at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:710)
> >        at org.apache.hadoop.hbase.master.AssignmentManager$TimeoutMonitor.chore(AssignmentManager.java:1605)
> >        at org.apache.hadoop.hbase.Chore.run(Chore.java:66)
> > Caused by: java.net.ConnectException: Connection refused
> >        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> >        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
> >        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
> >        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:408)
> >        at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:328)
> >        at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:883)
> >        at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:750)
> >        at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
> >        at $Proxy6.getProtocolVersion(Unknown Source)
> >        at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:419)
> >        at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:393)
> >        at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:444)
> >        at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:349)
> >        ... 8 more
> > 2011-05-28 23:23:35,292 WARN
> > org.apache.hadoop.hbase.master.AssignmentManager: Unable to find a viable
> > location to assign region -ROOT-,,0.70236052
> >
> >
> >
> > III. Here is the zk status from http://localhost:60010/zk.jsp
> >
> > HBase is rooted at /hbase
> > Master address: sean-PowerEdge:60000
> > Region server holding ROOT: null
> > Region servers:
> >  sean-PowerEdge:60020
> > Quorum Server Statistics:
> >  localhost:2181
> >  Zookeeper version: 3.3.2-CDH3B4--1, built on 02/21/2011 20:16 GMT
> >  Clients:
> >   /127.0.0.1:42221[0](queued=0,recved=1,sent=0)
> >   /127.0.0.1:44071[1](queued=0,recved=39,sent=44)
> >   /127.0.0.1:44078[1](queued=0,recved=23,sent=24)
> >   /127.0.0.1:44085[1](queued=0,recved=23,sent=23)
> >   /127.0.0.1:44077[1](queued=0,recved=19,sent=19)
> >
> >  Latency min/avg/max: 0/6/164
> >  Received: 105
> >  Sent: 110
> >  Outstanding: 0
> >  Zxid: 0x148
> >  Mode: standalone
> >  Node count: 12
> >
> >
> > What's the problem causing the above symptom?
> >
> > Thanks,
> > --
> > --Sean
> >
>



-- 
--Sean
