hbase-user mailing list archives

From ramkrishna vasudevan <ramkrishna.s.vasude...@gmail.com>
Subject Re: Follow-up to regionservers not being online - more logs included
Date Fri, 19 Oct 2012 15:53:02 GMT
Could you also attach the Master logs? It looks like the ROOT region
assignment failed. That seems to be the first problem.


On Fri, Oct 19, 2012 at 7:11 PM, Dan Brodsky <danbrodsky@gmail.com> wrote:

> I'm still having several issues with my cluster. This used to all
> work, and there have been no recent configuration changes.
> To recap, Master and regionservers all appear to start successfully,
> but several regionservers do not show as online on the HBase master
> status page. Moreover, a number of regions are stuck in transition
> and never open. Of the 5 regionservers currently on the status
> page, only two have numberOfOnlineRegions > 0.
> Log file snippets:
> First, the ZooKeeper dump from the master status web page shows
> that some of the regionservers have connected to ZK, but they still
> don't show as being online. Note that the IP ending in 217 is the
> HBase master; the ones ending in 31-40 are RS's 1-10 respectively:
> http://paste.ee/p/JAUfJ
> This is the log file for one of the regionservers that did not come
> online, showing not much of anything, I'm afraid:
> http://paste.ee/p/KHgOP
> In one of the RegionServers that did come online, I'm seeing this
> error repeat over and over (several of the RS_ZK_REGION_OPENING debug
> statements precede the error): http://paste.ee/p/lbiTN
> ZooKeeper log for one of the ZK nodes. Not much remarkable here; the
> nodes connect successfully, and a session with the HBase master
> (which is also a ZK quorum peer) is repeatedly opened and closed:
> http://paste.ee/p/zjSCO
> The master log doesn't show much. A lot of this:
> CatalogTracker: Failed verification of .META.,,1 at
> address=dn-4,60020,1350563250999;
> org.apache.hadoop.hbase.NotServingRegionException:
> org.apache.hadoop.hbase.NotServingRegionException: Region is not
> online: .META.,,1
> But then it does find .META. and open it on a different RS:
> 2012-10-19 12:59:21,480 INFO
> org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Handling
> OPENED event for .META.,,1.1028785192 from dn-3,60020,1350651496690;
> deleting unassigned node
> 2012-10-19 12:59:21,482 INFO
> org.apache.hadoop.hbase.master.AssignmentManager: The master has
> opened the region .META.,,1.1028785192 that was online on
> dn-3,60020,1350651496690
> 2012-10-19 12:59:21,497 INFO org.apache.hadoop.hbase.master.HMaster:
> .META. assigned=2, rit=false, location=dn-3,60020,1350651496690
> The master log file goes on to show that 71 regions come online, which
> is consistent with the master status page.
> Thoughts?
