hbase-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jean-Daniel Cryans" <jdcry...@apache.org>
Subject Re: Hbase corrupts data after reporting MSG_REPORT_CLOSE to master during compaction and split process
Date Thu, 04 Sep 2008 15:41:20 GMT
Cosmin,

Is the NPE something like this?

2008-08-28 19:11:50,181 INFO org.apache.hadoop.hbase.regionserver.HRegion:
starting compaction on region web_pages,
http://automotive.autoaubaine.com/occasions-auto/Saguenay-Lac-St-Jean-listing.html,1219963284600
2008-08-28 19:11:50,228 ERROR
org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction failed
for region web_pages,
http://automotive.autoaubaine.com/occasions-auto/Saguenay-Lac-St-Jean-listing.html,1219963284600
java.lang.NullPointerException
        at
org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:768)
        at
org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:912)
        at
org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:870)
        at
org.apache.hadoop.hbase.regionserver.CompactSplitThread.run(CompactSplitThread.java:84)

This region is now corrupted. We see stuff like that doing a get in the
shell:

08/09/04 11:27:44 DEBUG client.HConnectionManager$TableServers: reloading
table servers because: java.io.IOException: Cannot open filename
/hbase/amsterdam_development/web_pages/798949571/attribute/mapfiles/7908375253086018803/data
        at
org.apache.hadoop.dfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1140)
        at
org.apache.hadoop.dfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1442)
        at
org.apache.hadoop.dfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1281)
        at
org.apache.hadoop.dfs.DFSClient$DFSInputStream.read(DFSClient.java:1385)
        at
org.apache.hadoop.dfs.DFSClient$DFSInputStream.read(DFSClient.java:1337)
        at java.io.DataInputStream.readInt(DataInputStream.java:370)
        at
org.apache.hadoop.io.SequenceFile$Reader.readRecordLength(SequenceFile.java:1847)
        at
org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1877)
        at
org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1782)
        at
org.apache.hadoop.io.MapFile$Reader.seekInternal(MapFile.java:463)
        at org.apache.hadoop.io.MapFile$Reader.getClosest(MapFile.java:558)
        at org.apache.hadoop.io.MapFile$Reader.getClosest(MapFile.java:541)
        at
org.apache.hadoop.hbase.regionserver.HStoreFile$BloomFilterMapFile$Reader.getClosest(HStoreFile.java:757)
...

I guess that it happened after the region got this:

2008-08-28 18:42:08,142 WARN org.apache.hadoop.dfs.DFSClient: DataStreamer
Exception: org.apache.hadoop.ipc.RemoteException:
org.apache.hadoop.dfs.LeaseExpiredException: No lease on
/hbase/amsterdam_development/web_pages/compaction.dir/1338764398/attribute/mapfiles/5346237486847461749/data
File does not exist. [Lease.  Holder: 44 46 53 43 6c 69 65 6e 74 5f 2d 31 39
31 39 31 31 30 38 30 36, heldlocks: 0, pendingcreates: 5]
        at
org.apache.hadoop.dfs.FSNamesystem.checkLease(FSNamesystem.java:1194)
        at
org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1125)
        at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:300)
        at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:446)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)

        at org.apache.hadoop.ipc.Client.call(Client.java:557)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:212)
        at org.apache.hadoop.dfs.$Proxy1.addBlock(Unknown Source)
        at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
        at
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
        at org.apache.hadoop.dfs.$Proxy1.addBlock(Unknown Source)
        at
org.apache.hadoop.dfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2335)
        at
org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2220)
        at
org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1700(DFSClient.java:1702)
        at
org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1842)

2008-08-28 18:42:08,161 WARN org.apache.hadoop.dfs.DFSClient: Error Recovery
for block blk_-8223656433789893214 bad datanode[0]
2008-08-28 18:42:08,162 ERROR
org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction/Split
failed for region web_pages,
http://automotive.autoaubaine.com/occasions-auto/Saguenay-Lac-St-Jean-listing.html,1219963284600
java.io.IOException: Could not get block locations. Aborting...
        at
org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2081)
        at
org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1300(DFSClient.java:1702)
        at
org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1818)


This seems like a serious problem.

J-D

On Thu, Sep 4, 2008 at 11:25 AM, Cosmin Lehene <clehene@adobe.com> wrote:

>
>
>
> On 9/3/08 6:40 PM, "stack" <stack@duboce.net> wrote:
>
> > Cosmin Lehene wrote:
> >> Hi,
> >>
> >> Thanks for explaining things.
> >> HBase Version    0.2.0, r690650
> >>
> >
> > Is that the released 0.2.0?
> >
> It was a build from the 0.2 branch.
> >> Hadoop Version    0.17.2, r684969
> >>
> >> It's hard to reproduce that. Right now HDFS gets corrupted after we
> start our
> >> insertion scripts.  I've enabled DEBUG however. I'll retry fresh with an
> >> upgrade  to hadoop 0.18 and Hbase trunk.
> >>
> >
> > I would be interested in seeing logs of a corrupt hbase especially when
> > we're under the illusion that we left 'corruption' long behind us.
> >> However we might start  digging more in the code to see if we can start
> >> patching things.
> >> Are there any issues with HDFS as well or it's just Hbase that needs
> fixing?
> >>
>
> I upgraded the cluster to HBase trunk "0.3.0"(I guess this should be  0.18)
>  and Hadoop 0.18 release. We're still pushing data, but so far it looks
> better than 0.2.0, however region servers die with
>
> 2008-09-03 10:17:20,457 FATAL
> org.apache.hadoop.hbase.regionserver.HRegionServer: Ran out of memory
> java.lang.OutOfMemoryError: Java heap space
>
> And the data got corrupted again with debug information only on the master
> server. Apparently it started a compaction and failed with
> NullPointerException leaving the region corrupted. Unfortunately the
> CompactSplitThread only logs the exception, not the stack trace.
>
> Most of the problems appear when inserting heavily (between 1000 - 2000
> requests/s). Under light insert rate (<400 requests/s) it behaves better.
>
> I recompiled now with more logging and started to load data again.
> Hopefully we will cross the 20M boundary.
>
> Cosmin
> >
> > Usually its HBase.  But if HDFS is sickly, HBase is too.
> >
> > St.Ack
> >
> >> Thanks,
> >> Cosmin
> >>
> >> On 9/2/08 10:14 PM, "stack" <stack@duboce.net> wrote:
> >>
> >> Thanks for digging in Cosmin.   What version of hbase?
> >>
> >> FYI, to repair the table, there is a tool to merge adjacent regions:
> >>
> >> $ ./bin/hbase org.apache.hadoop.hbase.util.Merge
> >> Usage: bin/hbase merge <table-name> <region-1> <region-2>
> >>
> >> See below for comments inline:
> >>
> >> Cosmin Lehene wrote:
> >>
> >>> ...
> >>> HMaster 10.131.237.52 log
> >>>
> >>> grep "users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6"
> >>> /usr/local/hbase/logs/hbase-hadoop-master-dev2.log.2008-08-29
> >>> 2008-08-29 07:58:59,660 INFO
> org.apache.hadoop.hbase.master.RegionManager:
> >>> assigning region
> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774 to
> >>> server 10.131.237.54:60020
> >>> 2008-08-29 07:59:02,728 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_PROCESS_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774 from
> >>> 10.131.237.54:60020
> >>> 2008-08-29 07:59:02,881 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774 from
> >>> 10.131.237.54:60020
> >>> 2008-08-29 07:59:02,881 INFO
> >>> org.apache.hadoop.hbase.master.ProcessRegionOpen$1:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774 open on
> >>> 10.131.237.54:60020
> >>> 2008-08-29 07:59:02,881 INFO
> >>> org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774 in region
> .META.,,1
> >>> with startcode 1220014953935 and server 10.131.237.54:60020
> >>> M1 2008-08-29 08:50:18,558 INFO
> >>> org.apache.hadoop.hbase.master.RegionManager: assigning region
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774 to server
> >>> 10.131.237.51:60020
> >>>
> >>>
> >>
> >> What happened in here to cause the master reassign a region already
> >> assigned?  It was on 54 then we are asked assign it to 51.  Would
> >> suggest you run with DEBUG.  Would help with the diagnosis (See FAQ for
> >> how).
> >>
> >>
> >>> 2008-08-29 08:50:21,935 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_PROCESS_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774 from
> >>> 10.131.237.51:60020
> >>> 2008-08-29 08:50:25,086 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_PROCESS_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774 from
> >>> 10.131.237.51:60020
> >>> ...
> >>> 2008-08-29 08:53:06,020 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_PROCESS_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774 from
> >>> 10.131.237.51:60020
> >>> 2008-08-29 08:53:06,020 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_PROCESS_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774 from
> >>> 10.131.237.51:60020
> >>> M2 2008-08-29 08:53:36,875 INFO
> >>> org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774 from
> >>> 10.131.237.51:60020
> >>>
> >>>
> >>
> >> The multiple instances of MSG_REPORT_PROCESS_OPEN would imply it was
> >> taking the redeployment a long time to come up replaying edits (If its
> >> replaying edits of a region already deployed, that'll make for a real
> mess).
> >>
> >>
> >>
> >>> 2008-08-29 08:53:36,876 INFO
> >>> org.apache.hadoop.hbase.master.ProcessRegionOpen$1:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774 open on
> >>> 10.131.237.51:60020
> >>> M3 2008-08-29 08:53:36,876 INFO
> >>> org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774 in region
> .META.,,1
> >>> with startcode 1220014954000 and server 10.131.237.51:60020
> >>> 2008-08-29 08:54:12,249 INFO
> org.apache.hadoop.hbase.master.RegionManager:
> >>> Skipping region
> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774
> >>> because it is already closing.
> >>> 2008-08-29 08:54:15,525 INFO
> org.apache.hadoop.hbase.master.RegionManager:
> >>> Skipping region
> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774
> >>> because it is already closing.
> >>> 2008-08-29 08:54:18,581 INFO
> org.apache.hadoop.hbase.master.RegionManager:
> >>> Skipping region
> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774
> >>> because it is already closing.
> >>> M4 2008-08-29 08:54:24,616 INFO
> >>> org.apache.hadoop.hbase.master.ServerManager: Received
> MSG_REPORT_CLOSE:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774 from
> >>> 10.131.237.51:60020
> >>> M5 2008-08-29 08:54:24,937 INFO
> >>> org.apache.hadoop.hbase.master.RegionManager: assigning region
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774 to server
> >>> 10.131.237.55:60020
> >>> 2008-08-29 08:54:28,133 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_PROCESS_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774 from
> >>> 10.131.237.55:60020
> >>>
> >>>
> >>
> >> Now we are assigning the region to 55.  I don't know why the
> >> MSG_REPORT_CLOSE happened over on 51.  Doesn't say in the regionserver
> >> log.    Our logging needs improving.  Without DEBUG its not possible to
> >> know why some of these actions happened.
> >>
> >> St.Ack
> >>
> >>
> >>> 2008-08-29 08:54:28,134 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774 from
> >>> 10.131.237.55:60020
> >>> 2008-08-29 08:54:28,221 INFO
> >>> org.apache.hadoop.hbase.master.ProcessRegionOpen$1:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774 open on
> >>> 10.131.237.55:60020
> >>> M6 2008-08-29 08:54:28,221 INFO
> >>> org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774 in region
> .META.,,1
> >>> with startcode 1220014319248 and server 10.131.237.55:60020
> >>> M7 2008-08-29 08:54:30,775 INFO
> >>> org.apache.hadoop.hbase.master.ServerManager: Received
> MSG_REPORT_SPLIT:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774: [B@4861ef74from
> >>> 10.131.237.51:60020
> >>> M8 2008-08-29 08:54:33,356 INFO
> >>> org.apache.hadoop.hbase.master.RegionManager: assigning region
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 to server
> >>> 10.131.237.52:60020
> >>> 2008-08-29 08:54:36,419 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_PROCESS_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 from
> >>> 10.131.237.52:60020
> >>> 2008-08-29 08:54:39,498 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_PROCESS_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 from
> >>> 10.131.237.52:60020
> >>> ...
> >>> 2008-08-29 08:55:03,727 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_PROCESS_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 from
> >>> 10.131.237.52:60020
> >>> M9 2008-08-29 08:55:04,835 INFO
> >>> org.apache.hadoop.hbase.master.ServerManager: Received
> MSG_REPORT_SPLIT:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774: [B@62927d48from
> >>> 10.131.237.55:60020
> >>> 2008-08-29 08:55:06,898 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_PROCESS_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 from
> >>> 10.131.237.52:60020
> >>> ...
> >>> 2008-08-29 08:56:04,627 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_PROCESS_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 from
> >>> 10.131.237.52:60020
> >>> M10 2008-08-29 08:56:04,627 INFO
> >>> org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 from
> >>> 10.131.237.52:60020
> >>> 2008-08-29 08:56:04,642 INFO
> >>> org.apache.hadoop.hbase.master.ProcessRegionOpen$1:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 open on
> >>> 10.131.237.52:60020
> >>> M11 2008-08-29 08:56:04,642 INFO
> >>> org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 in region
> .META.,,1
> >>> with startcode 1220014954468 and server 10.131.237.52:60020
> >>> 2008-08-29 08:56:10,645 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_CLOSE:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 from
> >>> 10.131.237.52:60020
> >>> 2008-08-29 08:56:16,111 INFO
> org.apache.hadoop.hbase.master.RegionManager:
> >>> assigning region
> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 to
> >>> server 10.131.236.91:60020
> >>> 2008-08-29 08:56:19,134 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_PROCESS_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 from
> >>> 10.131.236.91:60020
> >>> 2008-08-29 08:56:19,134 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 from
> >>> 10.131.236.91:60020
> >>> 2008-08-29 08:56:19,134 INFO
> >>> org.apache.hadoop.hbase.master.ProcessRegionOpen$1:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 open on
> >>> 10.131.236.91:60020
> >>> M12 2008-08-29 08:56:19,134 INFO
> >>> org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 in region
> .META.,,1
> >>> with startcode 1220014953772 and server 10.131.236.91:60020
> >>> 2008-08-29 08:56:43,935 INFO
> org.apache.hadoop.hbase.master.RegionManager:
> >>> assigning region
> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220024667454 to
> >>> server 10.131.236.91:60020
> >>> 2008-08-29 08:56:47,158 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_PROCESS_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220024667454 from
> >>> 10.131.236.91:60020
> >>> 2008-08-29 08:56:50,274 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_PROCESS_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220024667454 from
> >>> 10.131.236.91:60020
> >>> 2008-08-29 08:56:53,421 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_PROCESS_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220024667454 from
> >>> 10.131.236.91:60020
> >>> 2008-08-29 08:56:53,422 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220024667454 from
> >>> 10.131.236.91:60020
> >>> 2008-08-29 08:56:53,500 INFO
> >>> org.apache.hadoop.hbase.master.ProcessRegionOpen$1:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220024667454 open on
> >>> 10.131.236.91:60020
> >>> M13 2008-08-29 08:56:53,500 INFO
> >>> org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220024667454 in region
> .META.,,1
> >>> with startcode 1220014953772 and server 10.131.236.91:60020
> >>> 2008-08-29 08:57:02,485 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_CLOSE:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 from
> >>> 10.131.236.91:60020
> >>> 2008-08-29 08:57:04,188 INFO
> org.apache.hadoop.hbase.master.RegionManager:
> >>> assigning region
> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 to
> >>> server 10.131.237.53:60020
> >>> 2008-08-29 08:57:07,309 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_PROCESS_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 from
> >>> 10.131.237.53:60020
> >>> 2008-08-29 08:57:07,309 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 from
> >>> 10.131.237.53:60020
> >>> 2008-08-29 08:57:07,309 INFO
> >>> org.apache.hadoop.hbase.master.ProcessRegionOpen$1:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 open on
> >>> 10.131.237.53:60020
> >>> 2008-08-29 08:57:07,309 INFO
> >>> org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 in region
> .META.,,1
> >>> with startcode 1220024985751 and server 10.131.237.53:60020
> >>> 2008-08-29 09:00:45,252 INFO
> org.apache.hadoop.hbase.master.BaseScanner:
> >>> Deleting region
> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774
> >>> because daughter splits no longer hold references
> >>> [hadoop@ice-dev2-buc ~]$ grep
> "users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6"
> >>>
> /usr/local/hbase/logs/hbase-hadoop-master-ice-dev2-buc.corp.adobe.com.log.20
> >>> 08-08-29
> >>> 2008-08-29 07:58:59,660 INFO
> org.apache.hadoop.hbase.master.RegionManager:
> >>> assigning region
> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774 to
> >>> server 10.131.237.54:60020
> >>> 2008-08-29 07:59:02,728 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_PROCESS_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774 from
> >>> 10.131.237.54:60020
> >>> 2008-08-29 07:59:02,881 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774 from
> >>> 10.131.237.54:60020
> >>> 2008-08-29 07:59:02,881 INFO
> >>> org.apache.hadoop.hbase.master.ProcessRegionOpen$1:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774 open on
> >>> 10.131.237.54:60020
> >>> 2008-08-29 07:59:02,881 INFO
> >>> org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774 in region
> .META.,,1
> >>> with startcode 1220014953935 and server 10.131.237.54:60020
> >>> 2008-08-29 08:50:18,558 INFO
> org.apache.hadoop.hbase.master.RegionManager:
> >>> assigning region
> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774 to
> >>> server 10.131.237.51:60020
> >>> 2008-08-29 08:50:21,935 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_PROCESS_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774 from
> >>> 10.131.237.51:60020
> >>> 2008-08-29 08:50:25,086 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_PROCESS_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774 from
> >>> 10.131.237.51:60020
> >>> ...
> >>> 2008-08-29 08:53:06,020 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_PROCESS_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774 from
> >>> 10.131.237.51:60020
> >>> 2008-08-29 08:53:36,875 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774 from
> >>> 10.131.237.51:60020
> >>> 2008-08-29 08:53:36,876 INFO
> >>> org.apache.hadoop.hbase.master.ProcessRegionOpen$1:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774 open on
> >>> 10.131.237.51:60020
> >>> 2008-08-29 08:53:36,876 INFO
> >>> org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774 in region
> .META.,,1
> >>> with startcode 1220014954000 and server 10.131.237.51:60020
> >>> 2008-08-29 08:54:12,249 INFO
> org.apache.hadoop.hbase.master.RegionManager:
> >>> Skipping region
> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774
> >>> because it is already closing.
> >>> 2008-08-29 08:54:15,525 INFO
> org.apache.hadoop.hbase.master.RegionManager:
> >>> Skipping region
> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774
> >>> because it is already closing.
> >>> 2008-08-29 08:54:18,581 INFO
> org.apache.hadoop.hbase.master.RegionManager:
> >>> Skipping region
> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774
> >>> because it is already closing.
> >>> 2008-08-29 08:54:24,616 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_CLOSE:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774 from
> >>> 10.131.237.51:60020
> >>> 2008-08-29 08:54:24,937 INFO
> org.apache.hadoop.hbase.master.RegionManager:
> >>> assigning region
> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774 to
> >>> server 10.131.237.55:60020
> >>> 2008-08-29 08:54:28,133 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_PROCESS_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774 from
> >>> 10.131.237.55:60020
> >>> 2008-08-29 08:54:28,134 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774 from
> >>> 10.131.237.55:60020
> >>> 2008-08-29 08:54:28,221 INFO
> >>> org.apache.hadoop.hbase.master.ProcessRegionOpen$1:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774 open on
> >>> 10.131.237.55:60020
> >>> 2008-08-29 08:54:28,221 INFO
> >>> org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774 in region
> .META.,,1
> >>> with startcode 1220014319248 and server 10.131.237.55:60020
> >>> 2008-08-29 08:54:30,775 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_SPLIT:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774: [B@4861ef74from
> >>> 10.131.237.51:60020
> >>> 2008-08-29 08:54:33,356 INFO
> org.apache.hadoop.hbase.master.RegionManager:
> >>> assigning region
> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 to
> >>> server 10.131.237.52:60020
> >>> 2008-08-29 08:54:36,419 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_PROCESS_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 from
> >>> 10.131.237.52:60020
> >>> 2008-08-29 08:54:39,498 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_PROCESS_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 from
> >>> 10.131.237.52:60020
> >>> ...
> >>> 2008-08-29 08:55:03,727 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_PROCESS_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 from
> >>> 10.131.237.52:60020
> >>> 2008-08-29 08:55:04,835 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_SPLIT:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774: [B@62927d48from
> >>> 10.131.237.55:60020
> >>> 2008-08-29 08:55:06,898 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_PROCESS_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 from
> >>> 10.131.237.52:60020
> >>> ...
> >>> 2008-08-29 08:56:04,627 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_PROCESS_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 from
> >>> 10.131.237.52:60020
> >>> 2008-08-29 08:56:04,627 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 from
> >>> 10.131.237.52:60020
> >>> 2008-08-29 08:56:04,642 INFO
> >>> org.apache.hadoop.hbase.master.ProcessRegionOpen$1:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 open on
> >>> 10.131.237.52:60020
> >>> 2008-08-29 08:56:04,642 INFO
> >>> org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 in region
> .META.,,1
> >>> with startcode 1220014954468 and server 10.131.237.52:60020
> >>> 2008-08-29 08:56:10,645 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_CLOSE:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 from
> >>> 10.131.237.52:60020
> >>> 2008-08-29 08:56:16,111 INFO
> org.apache.hadoop.hbase.master.RegionManager:
> >>> assigning region
> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 to
> >>> server 10.131.236.91:60020
> >>> 2008-08-29 08:56:19,134 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_PROCESS_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 from
> >>> 10.131.236.91:60020
> >>> 2008-08-29 08:56:19,134 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 from
> >>> 10.131.236.91:60020
> >>> 2008-08-29 08:56:19,134 INFO
> >>> org.apache.hadoop.hbase.master.ProcessRegionOpen$1:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 open on
> >>> 10.131.236.91:60020
> >>> 2008-08-29 08:56:19,134 INFO
> >>> org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 in region
> .META.,,1
> >>> with startcode 1220014953772 and server 10.131.236.91:60020
> >>> 2008-08-29 08:56:43,935 INFO
> org.apache.hadoop.hbase.master.RegionManager:
> >>> assigning region
> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220024667454 to
> >>> server 10.131.236.91:60020
> >>> 2008-08-29 08:56:47,158 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_PROCESS_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220024667454 from
> >>> 10.131.236.91:60020
> >>> 2008-08-29 08:56:50,274 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_PROCESS_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220024667454 from
> >>> 10.131.236.91:60020
> >>> 2008-08-29 08:56:53,421 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_PROCESS_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220024667454 from
> >>> 10.131.236.91:60020
> >>> 2008-08-29 08:56:53,422 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220024667454 from
> >>> 10.131.236.91:60020
> >>> 2008-08-29 08:56:53,500 INFO
> >>> org.apache.hadoop.hbase.master.ProcessRegionOpen$1:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220024667454 open on
> >>> 10.131.236.91:60020
> >>> 2008-08-29 08:56:53,500 INFO
> >>> org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220024667454 in region
> .META.,,1
> >>> with startcode 1220014953772 and server 10.131.236.91:60020
> >>> 2008-08-29 08:57:02,485 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_CLOSE:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 from
> >>> 10.131.236.91:60020
> >>> 2008-08-29 08:57:04,188 INFO
> org.apache.hadoop.hbase.master.RegionManager:
> >>> assigning region
> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 to
> >>> server 10.131.237.53:60020
> >>> 2008-08-29 08:57:07,309 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_PROCESS_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 from
> >>> 10.131.237.53:60020
> >>> 2008-08-29 08:57:07,309 INFO
> org.apache.hadoop.hbase.master.ServerManager:
> >>> Received MSG_REPORT_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 from
> >>> 10.131.237.53:60020
> >>> 2008-08-29 08:57:07,309 INFO
> >>> org.apache.hadoop.hbase.master.ProcessRegionOpen$1:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 open on
> >>> 10.131.237.53:60020
> >>> 2008-08-29 08:57:07,309 INFO
> >>> org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231 in region
> .META.,,1
> >>> with startcode 1220024985751 and server 10.131.237.53:60020
> >>> M14 2008-08-29 09:00:45,252 INFO
> org.apache.hadoop.hbase.master.BaseScanner:
> >>> Deleting region
> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774
> >>> because daughter splits no longer hold references
> >>>
> >>>
> >>> Region server log 10.131.237.51
> >>>
> >>> grep users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6
> >>> /usr/local/hbase/logs/hbase-hadoop-regionserver-dev1.log.2008-08-29
> >>> 2008-08-29 08:50:18,868 INFO
> >>> org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774
> >>> 2008-08-29 08:52:52,323 INFO
> >>> org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774
> >>> R1 2008-08-29 08:53:34,330 INFO
> >>> org.apache.hadoop.hbase.regionserver.HRegion: region
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774/62578664
> available
> >>> R2 2008-08-29 08:53:34,330 INFO
> >>> org.apache.hadoop.hbase.regionserver.HRegion: starting compaction on
> region
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774
> >>> R3 2008-08-29 08:54:09,035 INFO
> >>> org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_CLOSE:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774: [B@284f4565
> >>> R4 2008-08-29 08:54:24,224 INFO
> >>> org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on
> region
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774 in 49sec
> >>> R5 2008-08-29 08:54:24,224 INFO
> >>> org.apache.hadoop.hbase.regionserver.HRegion: Starting split of region
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774
> >>> R6 2008-08-29 08:54:24,234 INFO
> >>> org.apache.hadoop.hbase.regionserver.HRegion: closed
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774
> >>> R7 2008-08-29 08:54:24,393 INFO
> >>> org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_CLOSE:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774: [B@284f4565
> >>> R8 2008-08-29 08:54:24,393 WARN
> >>> org.apache.hadoop.hbase.regionserver.HRegion: region
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774 already closed
> >>> R9 2008-08-29 08:54:27,323 INFO
> >>> org.apache.hadoop.hbase.regionserver.HRegion: region
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231/620189721
> available
> >>> R10 2008-08-29 08:54:27,324 INFO
> >>> org.apache.hadoop.hbase.regionserver.HRegion: closed
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231
> >>> R11 2008-08-29 08:54:27,726 INFO
> >>> org.apache.hadoop.hbase.regionserver.CompactSplitThread: region split,
> META
> >>> updated, and report to master all successful. Old region=REGION =>
> {NAME =>
> >>> 'users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220021933774', STARTKEY =>
> >>> '27B5C924-6FE4-4484-B131-D5C9E71AF4A6', ENDKEY =>
> >>> '2FB0249D-0A95-47BB-A4F1-6C7600039542', ENCODED => 62578664, OFFLINE
=>
> >>> true, SPLIT => true, TABLE => {{NAME => 'users', IS_ROOT =>
'false',
> IS_META
> >>> => 'false', FAMILIES => [{NAME => 'test', BLOOMFILTER => 'false',
> >>> COMPRESSION => 'NONE', VERSIONS => '3', LENGTH => '2147483647',
TTL =>
> '-1',
> >>> IN_MEMORY => 'false', BLOCKCACHE => 'false'}]}}, new regions:
> >>> users,27B5C924-6FE4-4484-B131-D5C9E71AF4A6,1220025264231,
> >>> users,2BB6AC25-2EA6-4311-9670-49B68B7BB975,1220025264231. Split took
> 3sec
> >>>
> >>>
> >>>
> >>>
> >>
> >>
> >>
> >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message