hbase-issues mailing list archives

From "stack (Updated) (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-3446) ProcessServerShutdown fails if META moves, orphaning lots of regions
Date Thu, 13 Oct 2011 23:38:12 GMT

     [ https://issues.apache.org/jira/browse/HBASE-3446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-3446:
-------------------------

      Resolution: Fixed
    Release Note: Makes the catalog/* classes retry, e.g. MetaEditor, MetaReader and CatalogTracker.
 Previously they would try once and fail unless that single attempt succeeded.  Retrying is
courtesy of HTable instances.
    Hadoop Flags: Reviewed
          Status: Resolved  (was: Patch Available)
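
To illustrate the try-once versus retry behaviour described in the release note, here is a
minimal sketch of a generic retry wrapper. It is not the HBase code: in the patch the retrying
comes from HTable instances, and the names below (RetrySketch, withRetries, maxAttempts,
pauseMs) are made up for this example.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical illustration only; not part of HBase.
class RetrySketch {

    /**
     * Runs op up to maxAttempts times. An IOException is treated as transient
     * (for example, the catalog region moved) and triggers another attempt
     * after pauseMs. Any other exception propagates immediately.
     */
    static <T> T withRetries(Callable<T> op, int maxAttempts, long pauseMs) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (IOException e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(pauseMs);
                }
            }
        }
        // Every attempt failed; surface the last failure to the caller.
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated catalog read that fails twice before succeeding.
        String row = withRetries(() -> {
            if (++calls[0] < 3) {
                throw new IOException("META moved");
            }
            return "row-found";
        }, 5, 10L);
        System.out.println(row + " after " + calls[0] + " attempts");
    }
}
```

With maxAttempts set to 1 this degrades to the old try-once-and-fail behaviour.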

Got all tests to pass, eventually.

A bunch of tests were failing because waitForMeta just hung on the meta-is-available boolean
during master startup, waiting for some background thread to set it to true once the meta
location had been set.  This was fine in the old days, when we would get an HRegionInterface
to .META. and verify its location directly over the HRegionInterface instance (with no
retries).  Now we no longer use such primitives: we've gone up the stack and have
HTables/HConnections do the search and 'verify' of meta for us.  We need to run a connection
get to know whether meta is available (if it is, the magic AtomicBoolean gets set).
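
The difference between passively waiting on the flag and actively probing can be sketched as
below. This is an assumption-laden illustration, not the CatalogTracker code: the class name,
the BooleanSupplier probe (standing in for the connection get) and the timeout parameters are
all invented here.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

// Hypothetical illustration only; not the HBase CatalogTracker.
class MetaAvailabilitySketch {
    private final AtomicBoolean metaAvailable = new AtomicBoolean(false);
    // Stands in for a connection-level lookup that succeeds once meta is reachable.
    private final BooleanSupplier probe;

    MetaAvailabilitySketch(BooleanSupplier probe) {
        this.probe = probe;
    }

    /**
     * Waits until meta is available or the timeout expires. Rather than blocking
     * until some other thread flips the flag, each loop iteration runs the probe
     * itself and sets the flag on success.
     */
    boolean waitForMeta(long timeoutMs, long pauseMs) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (!metaAvailable.get()) {
            if (probe.getAsBoolean()) {
                metaAvailable.set(true);
                break;
            }
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            Thread.sleep(pauseMs);
        }
        return true;
    }

    boolean isMetaAvailable() {
        return metaAvailable.get();
    }

    public static void main(String[] args) throws InterruptedException {
        int[] probes = {0};
        // Simulated probe that only succeeds on the third call.
        MetaAvailabilitySketch tracker = new MetaAvailabilitySketch(() -> ++probes[0] >= 3);
        boolean ok = tracker.waitForMeta(1000L, 10L);
        System.out.println("meta available: " + ok + ", probes run: " + probes[0]);
    }
}
```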

Other miscellaneous stuff, like testshell, was failing for me because it couldn't find the
cluster; it needs to be set up with the cluster's configuration.

Moved more of the meta migration code into the MetaMigrationRemoveHTD class rather than have
it spread all about.

Changed the LocalHBaseCluster#join method so it uses the old thread-dumping join, which
prints a thread dump if we have been waiting more than 60 seconds for something to finish.
Helped me debug a few tests here.
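
For readers unfamiliar with the idea, a thread-dumping join can be sketched roughly as follows.
This is not the LocalHBaseCluster implementation; the class and method names and the
dumpIntervalMs parameter are assumptions for the example (the join described above would
correspond to an interval of 60 seconds).

```java
import java.util.Map;

// Hypothetical illustration only; not the HBase implementation.
class ThreadDumpingJoinSketch {

    /** Formats the stack traces of all live threads into one string. */
    static String dumpThreads() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            sb.append('"').append(t.getName()).append("\" state=").append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    /**
     * Joins on t, printing a thread dump to stderr each time dumpIntervalMs
     * elapses while t is still running. Returns the number of dumps printed.
     */
    static int threadDumpingJoin(Thread t, long dumpIntervalMs) throws InterruptedException {
        if (dumpIntervalMs <= 0) {
            // join(0) would wait forever and never dump.
            throw new IllegalArgumentException("dumpIntervalMs must be > 0");
        }
        int dumps = 0;
        while (t.isAlive()) {
            t.join(dumpIntervalMs);
            if (t.isAlive()) {
                dumps++;
                System.err.println("Still waiting on " + t.getName() + "; thread dump:\n" + dumpThreads());
            }
        }
        return dumps;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(250L);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "slow-worker");
        worker.start();
        int dumps = threadDumpingJoin(worker, 100L);
        System.out.println("joined after " + dumps + " thread dump(s)");
    }
}
```

The dump shows what every thread is blocked on, which is what makes a hung test debuggable.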

Otherwise, the patch is what was up on rb.
                
> ProcessServerShutdown fails if META moves, orphaning lots of regions
> --------------------------------------------------------------------
>
>                 Key: HBASE-3446
>                 URL: https://issues.apache.org/jira/browse/HBASE-3446
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.90.0
>            Reporter: Todd Lipcon
>            Assignee: stack
>            Priority: Blocker
>             Fix For: 0.92.0
>
>         Attachments: 3446-v11.txt, 3446-v12.txt, 3446-v13.txt, 3446-v14.txt, 3446-v2.txt,
>                      3446-v3.txt, 3446-v4.txt, 3446-v7.txt, 3446-v9.txt, 3446.txt, 3446v15.txt, 3446v23.txt
>
>
> I ran a rolling restart on a 5 node cluster with lots of regions, and afterwards had
> LOTS of regions left orphaned. The issue appears to be that ProcessServerShutdown failed because
> the server hosting META was restarted around the same time as another server was being processed

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
