hbase-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Daniel Leffel" <daniel.lef...@gmail.com>
Subject Re: Problem caused by HBASE-617 - Blocks missing from hlog.dat
Date Wed, 21 May 2008 15:12:51 GMT
After letting the hbase attempt to assign regions all night long, I awoke to
an unassigned meta region. Here is the master log:

Desc: {name: .META., families: {info:={name: info, max versions: 1,
compression: NONE, in memory: false, max length: 2147483647, bloom filter:
none}}}
2008-05-21 11:10:29,560 DEBUG org.apache.hadoop.hbase.HMaster: Main
processing loop: ProcessRegionClose of .META.,,1, true
2008-05-21 11:10:29,560 INFO org.apache.hadoop.hbase.HMaster: region closed:
.META.,,1
2008-05-21 11:10:29,560 INFO org.apache.hadoop.hbase.HMaster: reassign
region: .META.,,1
2008-05-21 11:10:29,729 INFO org.apache.hadoop.hbase.HMaster: assigning
region .META.,,1 to server 10.252.242.159:60020
2008-05-21 11:10:32,740 DEBUG org.apache.hadoop.hbase.HMaster: Received
MSG_REPORT_PROCESS_OPEN : .META.,,1 from 10.252.242.159:60020
2008-05-21 11:10:32,740 DEBUG org.apache.hadoop.hbase.HMaster: Received
MSG_REPORT_CLOSE : .META.,,1 from 10.252.242.159:60020
2008-05-21 11:10:32,740 INFO org.apache.hadoop.hbase.HMaster:
10.252.242.159:60020 no longer serving regionname: .META.,,1, startKey: <>,
endKey: <>, encodedName: 1028785192, tableDesc: {name: .META., families:
{info:={name: info, max versions: 1, compression: NONE, in memory: false,
max length: 2147483647, bloom filter: none}}}
2008-05-21 11:10:32,740 DEBUG org.apache.hadoop.hbase.HMaster: Main
processing loop: ProcessRegionClose of .META.,,1, true
2008-05-21 11:10:32,741 INFO org.apache.hadoop.hbase.HMaster: region closed:
.META.,,1
2008-05-21 11:10:32,741 INFO org.apache.hadoop.hbase.HMaster: reassign
region: .META.,,1
2008-05-21 11:10:34,585 INFO org.apache.hadoop.hbase.HMaster: assigning
region .META.,,1 to server 10.254.30.79:60020
2008-05-21 11:10:37,595 DEBUG org.apache.hadoop.hbase.HMaster: Received
MSG_REPORT_PROCESS_OPEN : .META.,,1 from 10.254.30.79:60020
2008-05-21 11:10:37,596 DEBUG org.apache.hadoop.hbase.HMaster: Received
MSG_REPORT_CLOSE : .META.,,1 from 10.254.30.79:60020
2008-05-21 11:10:37,596 INFO org.apache.hadoop.hbase.HMaster:
10.254.30.79:60020 no longer serving regionname: .META.,,1, startKey: <>,
endKey: <>, encodedName: 1028785192, tableDesc: {name: .META., families:
{info:={name: info, max versions: 1, compression: NONE, in memory: false,
max length: 2147483647, bloom filter: none}}}
2008-05-21 11:10:37,596 DEBUG org.apache.hadoop.hbase.HMaster: Main
processing loop: ProcessRegionClose of .META.,,1, true


and here is the regin server side:
2008-05-21 11:12:38,027 INFO org.apache.hadoop.hbase.HRegionServer:
MSG_REGION_OPEN : .META.,,1
2008-05-21 11:12:38,027 DEBUG org.apache.hadoop.hbase.HRegion: Opening
region .META.,,1/1028785192
2008-05-21 11:12:38,035 DEBUG org.apache.hadoop.hbase.HStore: loaded
/hbase/.META./1028785192/info/info/2509658022189995817, isReference=false
2008-05-21 11:12:38,036 DEBUG org.apache.hadoop.hbase.HStore: loaded
/hbase/.META./1028785192/info/info/8183182393002383771, isReference=false
2008-05-21 11:12:38,049 DEBUG org.apache.hadoop.hbase.HStore: Loaded 2
file(s) in hstore 1028785192/info, max sequence id 917793587
2008-05-21 11:12:38,116 ERROR org.apache.hadoop.hbase.HRegionServer: error
opening region .META.,,1
java.io.EOFException
        at java.io.DataInputStream.readFully(DataInputStream.java:180)
        at java.io.DataInputStream.readFully(DataInputStream.java:152)
        at
org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1434)
        at
org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1411)
        at
org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1400)
        at
org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1395)
        at org.apache.hadoop.io.MapFile$Reader.<init>(MapFile.java:263)
        at org.apache.hadoop.io.MapFile$Reader.<init>(MapFile.java:242)
        at
org.apache.hadoop.hbase.HStoreFile$HbaseMapFile$HbaseReader.<init>(HStoreFile.java:554)
        at
org.apache.hadoop.hbase.HStoreFile$BloomFilterMapFile$Reader.<init>(HStoreFile.java:609)
        at org.apache.hadoop.hbase.HStoreFile.getReader(HStoreFile.java:382)
        at org.apache.hadoop.hbase.HStore.<init>(HStore.java:849)
        at org.apache.hadoop.hbase.HRegion.<init>(HRegion.java:431)
        at
org.apache.hadoop.hbase.HRegionServer.openRegion(HRegionServer.java:1258)
        at
org.apache.hadoop.hbase.HRegionServer$Worker.run(HRegionServer.java:1204)
        at java.lang.Thread.run(Thread.java:619)



On Tue, May 20, 2008 at 4:19 PM, stack <stack@duboce.net> wrote:

> Daniel Leffel wrote:
>
>> After experiencing a region server that would not exit (HBASE-617), I
>> tried
>> to bring back up hbase (after first having shut down and bringing back up
>> DFS).
>>
>> There are around 370 regions. The first 250 were assigned to region
>> servers
>> within 5 minutes of startup. The rest of the regions took the better part
>> of
>> the day to become assigned to a region server. A quick inspection of the
>> regionserver logs were showing messages like the following:
>>
>> 2008-05-20 18:33:46,964 DEBUG org.apache.hadoop.hbase.HMaster: Received
>> MSG_REPORT_PROCESS_OPEN : categories,2864153,1211005494348 from
>> 10.254.26.31:60020
>>
>>
>
> These messages are sent over to the master by the regionserver as a kind of
> ping saying "I'm still alive and working on whatever it was you gave me".
>
> Can you tell what was happening by looking in regionserver logs?
>
> Was it that all regions had been given to a single regionserver and it was
> busy replaying edits before bringing the regions online (There is a single
> worker thread per regionserver.  If lots of edits to replay, can take
> seconds to minutes to bring on a region).
>
> Did the regions come online gradually or all in a lump?
>
>> After waiting for all the regions to be assigned (and an absence of the
>> above message appearing in the log), I started a MapReduce job that
>> iterates
>> over all regions. Immediately, the above mentioned region began to show up
>> in the logs again with the above message and the job failed with an
>> IOException because it couldn't locate blocks.
>>
>> I ran fsck on /hbase and sure enough, blocks are missing from the
>> following
>> file (although it reports a size of 0 as what's missing - I presume it
>> just
>> doesn't know):
>>
>> /hbase/log_10.254.30.79_1211300015031_60020/hlog.dat.000
>>
>>
> The above looks like the innocuous messages described in
> https://issues.apache.org/jira/browse/HBASE-509.
>
> St.Ack
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message