hbase-issues mailing list archives

From "Jean-Daniel Cryans (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (HBASE-6310) -ROOT- corruption when .META. is using the old encoding scheme
Date Fri, 20 Jul 2012 23:17:35 GMT

     [ https://issues.apache.org/jira/browse/HBASE-6310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Daniel Cryans resolved HBASE-6310.
---------------------------------------

       Resolution: Invalid
    Fix Version/s:     (was: 0.94.2)
                       (was: 0.96.0)

I'm resolving this as invalid. I was thrown in the wrong direction by what I thought were
old/new .META. rows (they in fact never changed), whereas it was a .META. region from almost
3 years ago that was brought back to life. It could have been something like HBASE-6417 that
happened, but since I don't have those logs anymore I can't be 100% sure until I reproduce
the issue.
                
> -ROOT- corruption when .META. is using the old encoding scheme
> --------------------------------------------------------------
>
>                 Key: HBASE-6310
>                 URL: https://issues.apache.org/jira/browse/HBASE-6310
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.94.0
>            Reporter: Jean-Daniel Cryans
>            Priority: Blocker
>
> We're still working on the root cause here, but after the leap second armageddon
we had a hard time getting our 0.94 cluster back up. This is what we saw in the logs until
the master died by itself:
> {noformat}
> 2012-07-01 23:01:52,149 DEBUG
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation:
> locateRegionInMeta parentTable=-ROOT-,
> metaLocation={region=-ROOT-,,0.70236052, hostname=sfor3s28,
> port=10304}, attempt=16 of 100 failed; retrying after sleep of 32000
> because: HRegionInfo was null or empty in -ROOT-,
> row=keyvalues={.META.,,1259448304806/info:server/1341124914705/Put/vlen=14/ts=0,
> .META.,,1259448304806/info:serverstartcode/1341124914705/Put/vlen=8/ts=0}
> {noformat}
> (it's strange that we retry this)
> This was really misleading because I could see the regioninfo in a scan:
> {noformat}
> hbase(main):002:0> scan '-ROOT-'
> ROW                                           COLUMN+CELL
>  .META.,,1                                    column=info:regioninfo,
> timestamp=1331755381142, value={NAME => '.META.,,1', STARTKEY => '',
> ENDKEY => '', ENCODED => 1028785192,}
>  .META.,,1                                    column=info:server,
> timestamp=1341183448693, value=sfor3s40:10304
>  .META.,,1
> column=info:serverstartcode, timestamp=1341183448693,
> value=1341183444689
>  .META.,,1                                    column=info:v,
> timestamp=1331755419291, value=\x00\x00
>  .META.,,1259448304806                        column=info:server,
> timestamp=1341124914705, value=sfor3s24:10304
>  .META.,,1259448304806
> column=info:serverstartcode, timestamp=1341124914705,
> value=1341124455863
> {noformat}
> Except that the devil is in the details: ".META.,,1" is not ".META.,,1259448304806".
Basically something writes to .META. by directly creating the row key without caring if the
row is in the old format. I did a deleteall in the shell and it fixed the issue... until some
time later it was stuck again because the edits reappeared (still not sure why). This time
the PostOpenDeployTasksThread threads were stuck in the RS trying to update .META. but there
was no logging (saw it with a jstack). I deleted the row again to make it work.
> I'm marking this as a blocker against 0.94.2 since we're trying to get 0.94.1 out, but
I wouldn't recommend upgrading to 0.94 if your cluster was created before 0.89.
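
The two -ROOT- row keys in the scan above differ only in their trailing region id, which is easy to miss by eye. As a rough illustration only (not HBase code), the sketch below splits a row key of the form `table,startkey,regionid` and flags pairs that name the same table and start key under different region ids. The helper names `split_region_name` and `same_region_different_row` are hypothetical, and the sketch assumes the plain three-part layout seen in the scan output; it does not handle names that carry an encoded-name suffix.

```python
def split_region_name(row_key: str):
    """Split 'table,startkey,regionid' into its three parts.

    Assumption: the table name is everything before the first comma and the
    region id is everything after the last comma; the start key in between
    may be empty, as it is for the rows in the scan.
    """
    table, _, rest = row_key.partition(",")
    start_key, _, region_id = rest.rpartition(",")
    return table, start_key, region_id


def same_region_different_row(a: str, b: str) -> bool:
    """True when two row keys share table and start key but differ in region id."""
    table_a, start_a, id_a = split_region_name(a)
    table_b, start_b, id_b = split_region_name(b)
    return (table_a, start_a) == (table_b, start_b) and id_a != id_b


if __name__ == "__main__":
    # Row keys copied from the scan output in this message.
    print(split_region_name(".META.,,1"))
    print(split_region_name(".META.,,1259448304806"))
    print(same_region_different_row(".META.,,1", ".META.,,1259448304806"))
```

A check like this only shows that two distinct rows cover the same key range; it says nothing about where the extra row came from.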

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
