hbase-issues mailing list archives

From "liang xie (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-3834) Store ignores checksum errors when opening files
Date Fri, 14 Sep 2012 11:14:07 GMT

     [ https://issues.apache.org/jira/browse/HBASE-3834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

liang xie updated HBASE-3834:

    Attachment: hbase-3834.tar.gz2

Hi all, I ran a manual test today. It looks like this issue no longer occurs in the latest
version, at least in 0.94.0.

Here are my test details:
0) My env: Ubuntu 10.10, hbase-0.94.0 release, standalone mode
1) Start HBase
2) From the hbase shell:
hbase(main):002:0> status
1 servers, 0 dead, 2.0000 average load

hbase(main):003:0> version
0.94.0, r1332822, Tue May  1 21:43:54 UTC 2012

hbase(main):004:0> create 'test','cf'
0 row(s) in 1.1520 seconds

hbase(main):005:0> list 'test'
1 row(s) in 0.0230 seconds

hbase(main):010:0> put 'test','row1','cf:a','value1'
0 row(s) in 0.0160 seconds

hbase(main):011:0> put 'test','row2','cf:b','value2'
0 row(s) in 0.0070 seconds

hbase(main):012:0> put 'test','row3','cf:c','value3'
0 row(s) in 0.0070 seconds

hbase(main):017:0> scan 'test'
ROW                                                          COLUMN+CELL                 
 row1                                                        column=cf:a, timestamp=1347619555652,
 row2                                                        column=cf:b, timestamp=1347619562943,
 row3                                                        column=cf:c, timestamp=1347619576704,
3 row(s) in 0.0310 seconds

hbase(main):027:0> flush 'test'
0 row(s) in 0.0770 seconds

hbase(main):028:0> exit

3) Shut down HBase, then edit the corresponding storefiles with vim
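
The corruption in step 3 could also be scripted instead of done by hand in vim. A minimal sketch in Python, assuming HBase is stopped and the storefile path is already known; the function name `corrupt_byte` and the choice of offset are hypothetical and not part of HBase:

```python
import os

def corrupt_byte(path, offset):
    """Invert all bits of the single byte at `offset`, in place.

    Returns the original byte value so the change can be undone.
    """
    size = os.path.getsize(path)
    if not 0 <= offset < size:
        raise ValueError("offset %d is outside the file (%d bytes)" % (offset, size))
    with open(path, "r+b") as f:
        f.seek(offset)
        original = f.read(1)
        f.seek(offset)
        # XOR with 0xFF guarantees the byte differs from the original.
        f.write(bytes([original[0] ^ 0xFF]))
    return original[0]
```

Calling it on a copy of a flushed storefile, with any offset inside the file, changes exactly one byte, which should be enough to make a checksum over that part of the file mismatch.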

4) Start HBase again

5) From the log file, we can see:
2012-09-14 18:54:15,096 INFO org.apache.hadoop.hbase.regionserver.Store: time to purge deletes
set to 0ms in store null
2012-09-14 18:54:15,118 INFO org.apache.hadoop.fs.FSInputChecker: Found checksum error: b[0,
org.apache.hadoop.fs.ChecksumException: Checksum error: file:/tmp/hbase/test/4c9f2cfda63b2e9785815ed2e841d052/cf/269b59832d68465687ebce880026a301
at 512
        at org.apache.hadoop.fs.FSInputChecker.verifySum(FSInputChecker.java:277)
        at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:241)
        at org.apache.hadoop.fs.FSInputChecker.fill(FSInputChecker.java:176)
2012-09-14 18:54:15,123 INFO org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler:
Opening of region {NAME => 'test,,1347619334483.4c9f2cfda63b2e9785815ed2e841d052.', STARTKEY
=> '', ENDKEY => '', ENCODED => 4c9f2cfda63b2e9785815ed2e841d052,} failed, marking
2012-09-14 18:54:15,123 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:48394-0x139c46a10d40001
Attempting to transition node 4c9f2cfda63b2e9785815ed2e841d052 from RS_ZK_REGION_OPENING to
2012-09-14 18:54:15,137 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:48394-0x139c46a10d40001
Successfully transitioned node 4c9f2cfda63b2e9785815ed2e841d052 from RS_ZK_REGION_OPENING
2012-09-14 18:54:15,138 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_FAILED_OPEN,
server=xieliang,48394,1347620050131, region=4c9f2cfda63b2e9785815ed2e841d052
2012-09-14 18:54:15,141 DEBUG org.apache.hadoop.hbase.master.handler.ClosedRegionHandler:
Handling CLOSED event for 4c9f2cfda63b2e9785815ed2e841d052
2012-09-14 18:54:15,141 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE;
was=test,,1347619334483.4c9f2cfda63b2e9785815ed2e841d052. state=CLOSED, ts=1347620055124,
2012-09-14 18:54:15,141 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:41380-0x139c46a10d40000
Creating (or updating) unassigned node for 4c9f2cfda63b2e9785815ed2e841d052 with OFFLINE state
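
The ChecksumException above reports a file offset ("at 512"), which fits a scheme where a checksum is kept per fixed-size chunk of the file. A minimal sketch of that idea in Python, assuming CRC32 over 512-byte chunks; it only illustrates the technique and is not Hadoop's FSInputChecker code or its on-disk checksum format, and the function names are hypothetical:

```python
import zlib

CHUNK = 512  # chunk size assumed for this sketch

def chunk_checksums(data, chunk=CHUNK):
    """Return one CRC32 value per `chunk`-sized slice of `data`."""
    return [zlib.crc32(data[i:i + chunk]) for i in range(0, len(data), chunk)]

def first_bad_offset(data, expected, chunk=CHUNK):
    """Return the start offset of the first chunk whose CRC32 does not
    match `expected`, or None if every chunk matches."""
    actual = chunk_checksums(data, chunk)
    if len(actual) != len(expected):
        raise ValueError("checksum count does not match data length")
    for index, (got, want) in enumerate(zip(actual, expected)):
        if got != want:
            return index * chunk
    return None
```

With checksums recorded before the edit, changing any byte in the second chunk makes `first_bad_offset` return the start of that chunk, and the caller can then refuse to use the file rather than skip it.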
> Store ignores checksum errors when opening files
> ------------------------------------------------
>                 Key: HBASE-3834
>                 URL: https://issues.apache.org/jira/browse/HBASE-3834
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.90.2
>            Reporter: Todd Lipcon
>            Assignee: liang xie
>            Priority: Critical
>             Fix For: 0.90.8
>         Attachments: hbase-3834.tar.gz2
> If you corrupt one of the storefiles in a region (eg using vim to muck up some bytes),
> the region will still open, but that storefile will just be ignored with a log message. We
> should probably not do this in general - better to keep that region unassigned and force an
> admin to make a decision to remove the bad storefile.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
