hbase-issues mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-3408) AssignmentManager NullPointerException
Date Wed, 05 Jan 2011 21:42:48 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977980#action_12977980 ]

Hudson commented on HBASE-3408:
-------------------------------

Integrated in HBase-TRUNK #1703 (See [https://hudson.apache.org/hudson/job/HBase-TRUNK/1703/])
    

> AssignmentManager NullPointerException
> --------------------------------------
>
>                 Key: HBASE-3408
>                 URL: https://issues.apache.org/jira/browse/HBASE-3408
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.90.0
>            Reporter: Matt Corgan
>             Fix For: 0.90.0
>
>         Attachments: HBASE-3408[0.90.0].patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> If AssignmentManager tries to move a region to an invalid destination server, it throws an NPE instead of choosing a random server as intended.
> Line 1009 should check that existingPlan.getDestination() != null:
>  if (existingPlan == null || forceNewPlan ||
>           (existingPlan.getDestination() != null && existingPlan.getDestination().equals(serverToExclude))) {
> I triggered it by manually moving regions around, probably to an invalid destination server.  I'm not currently able to build the project to test whether that's the extent of the problem, so here's a little more info...
> It leaves a stranded region-in-transition until the master and/or regionserver are restarted, and it causes problems like the following.  "hbck -fix" was unable to repair it.
> 2011-01-04 00:14:10,948 DEBUG org.apache.hadoop.hbase.master.CatalogJanitor: Scanned 4287 catalog row(s) and gc'd 0 unreferenced parent region(s)
> 2011-01-04 00:14:18,574 DEBUG org.apache.hadoop.hbase.master.HMaster: Not running balancer because 1 region(s) in transition: {23ebce9a5d174f87bfb96ed1da387fdc=RandomValue,,1291219068335.23ebce9a5d174f87bfb96ed1da387fdc. state=OFFLINE, ts=1294118046139}
> 2011-01-04 00:14:36,142 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  RandomValue,,1291219068335.23ebce9a5d174f87bfb96ed1da387fdc. state=OFFLINE, ts=1294118046139
> 2011-01-04 00:14:36,142 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been OFFLINE for too long, reassigning RandomValue,,1291219068335.23ebce9a5d174f87bfb96ed1da387fdc. to a random server
> 2011-01-04 00:14:36,142 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; was=RandomValue,,1291219068335.23ebce9a5d174f87bfb96ed1da387fdc. state=OFFLINE, ts=1294118046139
> 2011-01-04 00:14:36,142 ERROR org.apache.hadoop.hbase.master.AssignmentManager$TimeoutMonitor: Caught exception
> java.lang.NullPointerException
>         at org.apache.hadoop.hbase.master.AssignmentManager.getRegionPlan(AssignmentManager.java:934) (I think this is .90.0RC1, so same bug on a different line number)
>         at org.apache.hadoop.hbase.master.AssignmentManager.getRegionPlan(AssignmentManager.java:909)
>         at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:822)
>         at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:663)
>         at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:643)
>         at org.apache.hadoop.hbase.master.AssignmentManager$TimeoutMonitor.chore(AssignmentManager.java:1481)
>         at org.apache.hadoop.hbase.Chore.run(Chore.java:66)
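
The null check proposed in the issue description can be illustrated with a small standalone sketch. It does not use the real HBase classes: Plan, needsNewPlan and needsNewPlanUnguarded are hypothetical stand-ins, server names are plain strings, and the unguarded form is only an assumption about what the condition looked like before the patch. The sketch shows that the guarded condition no longer throws when a plan's destination is null; what the master then does with such a plan is outside its scope.

```java
public class RegionPlanCheck {
    /** Hypothetical stand-in for a region plan; not the real HBase RegionPlan class. */
    public static final class Plan {
        private final String destination; // null models an invalid destination server

        public Plan(String destination) {
            this.destination = destination;
        }

        public String getDestination() {
            return destination;
        }
    }

    /** Assumed pre-patch form: dereferences the destination without a null check. */
    public static boolean needsNewPlanUnguarded(Plan existingPlan, boolean forceNewPlan,
                                                String serverToExclude) {
        return existingPlan == null || forceNewPlan
            || existingPlan.getDestination().equals(serverToExclude);
    }

    /** Guarded form, mirroring the condition proposed in the issue. */
    public static boolean needsNewPlan(Plan existingPlan, boolean forceNewPlan,
                                       String serverToExclude) {
        return existingPlan == null || forceNewPlan
            || (existingPlan.getDestination() != null
                && existingPlan.getDestination().equals(serverToExclude));
    }

    public static void main(String[] args) {
        Plan broken = new Plan(null);

        // Guarded check evaluates without throwing.
        System.out.println(needsNewPlan(broken, false, "rs1"));

        // Unguarded check calls equals() on a null destination.
        try {
            needsNewPlanUnguarded(broken, false, "rs1");
        } catch (NullPointerException e) {
            System.out.println("unguarded check threw NPE");
        }
    }
}
```

Short-circuit evaluation of && means equals() is only reached when the destination is non-null, which is why the guarded form is safe.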

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

