hbase-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Lucas Nazário dos Santos <nazario.lu...@gmail.com>
Subject Re: Data loss
Date Sun, 23 Aug 2009 14:55:21 GMT
Forgot to say that it crashed yesterday right after a count in a table with
19000 thousand registers using HBase shell.

The job running on Hadoop was giving me the exceptions bellow:

org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to
contact region server /192.168.1.2:60020 for region unknown, row
'unknown', but failed after 1 attempts.
Exceptions:

	at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:425)
	at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getHRegionConnection(HConnectionManager.java:799)
	at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getHRegionConnection(HConnectionManager.java:815)
	at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRootRegion(HConnectionManager.java:870)
	at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:515)
	at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:497)
	at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:628)
	at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:524)
	at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:497)
	at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:628)
	at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:528)
	at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:491)
	at org.apache.hadoop.hbase.client.HTable.(HTable.java:123)
	at org.apache.hadoop.hbase.client.HTable.(HTable.java:101)
	at com.nash.ninvest.backend.feed.ReduceFeedsToNewsIntoHbase.setup(Unknown
Source)
	at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:172)
	at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:543)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:410)

ipc.HBaseClient: Retrying connect to server: /192.168.1.2:60020.
Already tried 3 time(s).
2009-08-23 07:14:41,146 INFO org.apache.hadoop.ipc.HBaseClient:
Retrying connect to server: /192.168.1.2:60020. Already tried 4
time(s).
2009-08-23 07:14:42,147 INFO org.apache.hadoop.ipc.HBaseClient:
Retrying connect to server: /192.168.1.2:60020. Already tried 5
time(s).
2009-08-23 07:14:43,148 INFO org.apache.hadoop.ipc.HBaseClient:
Retrying connect to server: /192.168.1.2:60020. Already tried 6
time(s).
2009-08-23 07:14:44,149 INFO org.apache.hadoop.ipc.HBaseClient:
Retrying connect to server: /192.168.1.2:60020. Already tried 7
time(s).
2009-08-23 07:14:45,149 INFO org.apache.hadoop.ipc.HBaseClient:
Retrying connect to server: /192.168.1.2:60020. Already tried 8
time(s).
2009-08-23 07:14:46,150 INFO org.apache.hadoop.ipc.HBaseClient:
Retrying connect to server: /192.168.1.2:60020. Already tried 9
time(s).
2009-08-23 07:14:46,152 INFO org.apache.hadoop.ipc.HbaseRPC: Server at
/192.168.1.2:60020 not available yet, Zzzzz...
2009-08-23 07:14:46,152 INFO org.apache.hadoop.ipc.HbaseRPC: Server at
/192.168.1.2:60020 could not be reached after 1 tries, giving up.
2009-08-23 07:14:47,156 INFO org.apache.hadoop.ipc.HBaseClient:
Retrying connect to server: /192.168.1.2:60020. Already tried 0
time(s).
2009-08-23 07:14:48,157 INFO org.apache.hadoop.ipc.HBaseClient:
Retrying connect to server: /192.168.1.2:60020. Already tried 1
time(s).
2009-08-23 07:14:49,159 INFO org.apache.hadoop.ipc.HBaseClient:
Retrying connect to server: /192.168.1.2:60020. Already tried 2
time(s).
2009-08-23 07:14:50,161 INFO org.apache.hadoop.ipc.HBaseClient:
Retrying connect to server: /192.168.1.2:60020. Already tried 3
time(s).
2009-08-23 07:14:51,162 INFO org.apache.hadoop.ipc.HBaseClient:
Retrying connect to server: /192.168.1.2:60020. Already tried 4
time(s).
2009-08-23 07:14:52,163 INFO org.apache.hadoop.ipc.HBaseClient:
Retrying connect to server: /192.168.1.2:60020. Already tried 5
time(s).
2009-08-23 07:14:53,164 INFO org.apache.hadoop.ipc.HBaseClient:
Retrying connect to server: /192.168.1.2:60020. Already tried 6
time(s).
2009-08-23 07:14:54,165 INFO org.apache.hadoop.ipc.HBaseClient:
Retrying connect to server: /192.168.1.2:60020. Already tried 7
time(s).
2009-08-23 07:14:55,166 INFO org.apache.hadoop.ipc.HBaseClient:
Retrying connect to server: /192.168.1.2:60020. Already tried 8
time(s).
2009-08-23 07:14:56,167 INFO org.apache.hadoop.ipc.HBaseClient:
Retrying connect to server: /192.168.1.2:60020. Already tried 9
time(s).
2009-08-23 07:14:56,168 INFO org.apache.hadoop.ipc.HbaseRPC: Server at
/192.168.1.2:60020 not available yet, Zzzzz...
2009-08-23 07:14:56,168 INFO org.apache.hadoop.ipc.HbaseRPC: Server at
/192.168.1.2:60020 could not be reached after 1 tries, giving up.
2009-08-23 07:14:57,185 INFO org.apache.hadoop.ipc.HBaseClient:
Retrying connect to server: /192.168.1.2:60020. Already tried 0
time(s).
2009-08-23 07:14:58,186 INFO org.apache.hadoop.ipc.HBaseClient:
Retrying connect to server: /192.168.1.2:60020. Already tried 1
time(s).
2009-08-23 07:14:59,187 INFO org.apache.hadoop.ipc.HBaseClient:
Retrying connect to server: /192.168.1.2:60020. Already tried 2
time(s).
2009-08-23 07:15:00,188 INFO org.apache.hadoop.ipc.HBaseClient:
Retrying connect to server: /192.168.1.2:60020. Already tried 3
time(s).
2009-08-23 07:15:01,189 INFO org.apache.hadoop.ipc.HBaseClient:
Retrying connect to server: /192.168.1.2:60020. Already tried 4
time(s).
2009-08-23 07:15:02,190 INFO org.apache.hadoop.ipc.HBaseClient:
Retrying connect to server: /192.168.1.2:60020. Already tried 5
time(s).
2009-08-23 07:15:03,191 INFO org.apache.hadoop.ipc.HBaseClient:
Retrying connect to server: /192.168.1.2:60020. Already tried 6
time(s).
2009-08-23 07:15:04,192 INFO org.apache.hadoop.ipc.HBaseClient:
Retrying connect to server: /192.168.1.2:60020. Already tried 7
time(s).
2009-08-23 07:15:05,193 INFO org.apache.hadoop.ipc.HBaseClient:
Retrying connect to server: /192.168.1.2:60020. Already tried 8
time(s).
2009-08-23 07:15:06,194 INFO org.apache.hadoop.ipc.HBaseClient:
Retrying connect to server: /192.168.1.2:60020. Already tried 9
time(s).
2009-08-23 07:15:06,195 INFO org.apache.hadoop.ipc.HbaseRPC: Server at
/192.168.1.2:60020 not available yet, Zzzzz...
2009-08-23 07:15:06,195 INFO org.apache.hadoop.ipc.HbaseRPC: Server at
/192.168.1.2:60020 could not be reached after 1 tries, giving up.
2009-08-23 07:15:07,366 INFO org.apache.hadoop.ipc.HBaseClient:
Retrying connect to server: /192.168.1.2:60020. Already tried 0
time(s).
2009-08-23 07:15:08,367 INFO org.apache.hadoop.ipc.HBaseClient:
Retrying connect to server: /192.168.1.2:60020. Already tried 1
time(s).
2009-08-23 07:15:09,368 INFO org.apache.hadoop.ipc.HBaseClient:
Retrying connect to server: /192.168.1.2:60020. Already tried 2
time(s).
2009-08-23 07:15:10,375 INFO org.apache.hadoop.ipc.HBaseClient:
Retrying connect to server: /192.168.1.2:60020. Already tried 3
time(s).
2009-08-23 07:15:11,384 INFO org.apache.hadoop.ipc.HBaseClient:
Retrying connect to server: /192.168.1.2:60020. Already tried 4
time(s).
2009-08-23 07:15:12,387 INFO org.apache.hadoop.ipc.HBaseClient:
Retrying connect to server: /192.168.1.2:60020. Already tried 5
time(s).
2009-08-23 07:15:13,388 INFO org.apache.hadoop.ipc.HBaseClient:
Retrying connect to server: /192.168.1.2:60020. Already tried 6
time(s).
2009-08-23 07:15:14,389 INFO org.apache.hadoop.ipc.HBaseClient:
Retrying connect to server: /192.168.1.2:60020. Already tried 7
time(s).
2009-08-23 07:15:15,390 INFO org.apache.hadoop.ipc.HBaseClient:
Retrying connect to server: /192.168.1.2:60020. Already tried 8
time(s).
2009-08-23 07:15:16,391 INFO org.apache.hadoop.ipc.HBaseClient:
Retrying connect to server: /192.168.1.2:60020. Already tried 9
time(s).
2009-08-23 07:15:16,392 INFO org.apache.hadoop.ipc.HbaseRPC: Server at
/192.168.1.2:60020 not available yet, Zzzzz...
2009-08-23 07:15:16,392 INFO org.apache.hadoop.ipc.HbaseRPC: Server at
/192.168.1.2:60020 could not be reached after 1 tries, giving up.
2009-08-23 07:15:16,395 WARN org.apache.hadoop.mapred.TaskTracker:
Error running child
org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to
contact region server /192.168.1.2:60020 for region unknown, row
'unknown', but failed after 1 attempts.
Exceptions:

	at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:425)
	at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getHRegionConnection(HConnectionManager.java:799)
	at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getHRegionConnection(HConnectionManager.java:815)
	at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRootRegion(HConnectionManager.java:870)
	at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:515)
	at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:497)
	at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:628)
	at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:524)
	at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:497)
	at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:628)
	at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:528)
	at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:491)
	at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:123)
	at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:101)
	at com.nash.ninvest.backend.feed.ReduceFeedsToNewsIntoHbase.setup(Unknown
Source)
	at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:172)
	at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:543)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:410)
	at org.apache.hadoop.mapred.Child.main(Child.java:170)
2009-08-23 07:15:16,402 INFO org.apache.hadoop.mapred.TaskRunner:
Runnning cleanup for the task

	at org.apache.hadoop.mapred.Child.main(Child.java:170)




On Sun, Aug 23, 2009 at 11:44 AM, Lucas Nazário dos Santos <
nazario.lucas@gmail.com> wrote:

> Chen, I've tried a couple of times and flush is not working for me in this
> particular case (small table, kill -9 right after the table is created).
>
> Andrew, the more frequenty reason for the crash is ZooKeeper (I think).
> This is what is happening to me and I couldn't yet overcome the problem (
> http://issues.apache.org/jira/browse/HBASE-1645). As written in the issue,
> when it happen I have to kill the process.
>
> But anyway, a very simple test you could do is to create a table, insert
> three to five registers, flush it and kill -9 HBase. After restarting HBase
> the table is gone (at least it's gone for me).
>
> I'm running HBase 0.20.0 RC2 over two machines using Ubuntu.
>
> Hadoop logs seem to be fine. But HBase files are full of errors. Let me
> post some of them bellow.
>
> Thanks,
> Lucas
>
>
>
> REGIONSERVER
> 2009-08-23 00:11:15,022 INFO org.apache.hadoop.hbase.regionserver.HLog:
> Roll /hbase/.logs/server2,60020,1250975459853/hlog.dat.1250993474713,
> entries=5, calc$
> 2009-08-23 02:11:15,564 INFO org.apache.hadoop.hbase.regionserver.HLog:
> Roll /hbase/.logs/server2,60020,1250975459853/hlog.dat.1250997075020,
> entries=16, cal$
> 2009-08-23 02:43:56,815 WARN org.apache.zookeeper.ClientCnxn: Exception
> closing session 0x12343f29cb20006 to sun.nio.ch.SelectionKeyImpl@eaa827
> java.io.IOException: TIMED OUT
>         at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:858)
> 2009-08-23 02:43:57,068 WARN org.apache.hadoop.hbase.util.Sleeper: We slept
> 559225ms, ten times longer than scheduled: 3000
> 2009-08-23 02:43:56,911 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
> 1622072927860071680 lease expired
> 2009-08-23 02:43:56,820 WARN org.apache.hadoop.hbase.util.Sleeper: We slept
> 567081ms, ten times longer than scheduled: 10000
> 2009-08-23 02:43:56,815 WARN org.apache.zookeeper.ClientCnxn: Exception
> closing session 0x12343f29cb20001 to sun.nio.ch.SelectionKeyImpl@1c5c3fe
> java.io.IOException: TIMED OUT
>         at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:858)
> 2009-08-23 02:43:57,857 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Got ZooKeeper event,
> state: Disconnected, type: None, path: null
> 2009-08-23 02:43:57,888 WARN
> org.apache.hadoop.hbase.regionserver.HRegionServer: unable to report to
> master for 561806 milliseconds - retrying
> 2009-08-23 02:43:59,455 INFO org.apache.zookeeper.ClientCnxn: Attempting
> connection to server server2/192.168.1.3:2181
> 2009-08-23 02:43:59,455 INFO org.apache.zookeeper.ClientCnxn: Attempting
> connection to server server2/192.168.1.3:2181
> 2009-08-23 02:43:59,511 INFO org.apache.zookeeper.ClientCnxn: Priming
> connection to java.nio.channels.SocketChannel[connected local=/
> 192.168.1.3:44468 remote$
> 2009-08-23 02:43:59,511 INFO org.apache.zookeeper.ClientCnxn: Server
> connection successful
> 2009-08-23 02:43:59,566 INFO org.apache.zookeeper.ClientCnxn: Priming
> connection to java.nio.channels.SocketChannel[connected local=/
> 192.168.1.3:44469 remote$
> 2009-08-23 02:43:59,566 INFO org.apache.zookeeper.ClientCnxn: Server
> connection successful
> 2009-08-23 02:43:59,614 WARN org.apache.zookeeper.ClientCnxn: Exception
> closing session 0x12343f29cb20006 to sun.nio.ch.SelectionKeyImpl@15d267a
> java.io.IOException: Session Expired
>         at
> org.apache.zookeeper.ClientCnxn$SendThread.readConnectResult(ClientCnxn.java:548)
>         at
> org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:661)
>         at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:897)
> 2009-08-23 02:43:59,621 INFO org.apache.zookeeper.ZooKeeper: Closing
> session: 0x12343f29cb20006
> 2009-08-23 02:44:00,014 INFO org.apache.zookeeper.ClientCnxn: Closing
> ClientCnxn for session: 0x12343f29cb20006
> 2009-08-23 02:44:00,355 INFO org.apache.zookeeper.ClientCnxn: Disconnecting
> ClientCnxn for session: 0x12343f29cb20006
> 2009-08-23 02:44:00,558 INFO org.apache.zookeeper.ZooKeeper: Session:
> 0x12343f29cb20006 closed
> 2009-08-23 02:44:00,935 INFO org.apache.zookeeper.ClientCnxn: EventThread
> shut down
> 2009-08-23 02:44:01,341 WARN org.apache.zookeeper.ClientCnxn: Exception
> closing session 0x12343f29cb20001 to sun.nio.ch.SelectionKeyImpl@894793
> java.io.IOException: Session Expired
>         at
> org.apache.zookeeper.ClientCnxn$SendThread.readConnectResult(ClientCnxn.java:548)
>         at
> org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:661)
>         at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:897)
> 2009-08-23 02:44:01,341 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Got ZooKeeper event,
> state: Expired, type: None, path: null
> 2009-08-23 02:44:01,341 ERROR
> org.apache.hadoop.hbase.regionserver.HRegionServer: ZooKeeper session
> expired
> 2009-08-23 02:44:04,821 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 1813 on 60020, call get([B@17d26f0, row=nazario.lucas@gmail.com,
> maxVersio$
> java.io.IOException: Server not running, aborting
>         at
> org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2299)
>         at
> org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:1764)
>         at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>         at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-08-23 02:44:07,985 ERROR
> org.apache.hadoop.hbase.regionserver.HRegionServer:
> java.io.IOException: Server not running, aborting
>         at
> org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2299)
>         at
> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1914)
>         at
> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1899)
>         at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>         at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-08-23 02:44:08,146 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 1814 on 60020, call openScanner([B@1af6aa, startRow=
> http://www.prnewswire.$
> java.io.IOException: Server not running, aborting
>         at
> org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2299)
>         at
> org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:1865)
>         at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>         at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-08-23 02:44:08,239 WARN
> org.apache.hadoop.hbase.regionserver.HRegionServer: Processing message
> (Retry: 0)
> org.apache.hadoop.hbase.Leases$LeaseStillHeldException
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> Method)
>         at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>         at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>         at
> org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:94)
>         at
> org.apache.hadoop.hbase.RemoteExceptionHandler.checkThrowable(RemoteExceptionHandler.java:48)
>         at
> org.apache.hadoop.hbase.RemoteExceptionHandler.checkIOException(RemoteExceptionHandler.java:66)
>         at
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:572)
>         at java.lang.Thread.run(Thread.java:619)
> 2009-08-23 02:44:09,368 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 1816 on 60020, call getClosestRowBefore([B@e3cec, [B@c1bb6b,
> [B@f79945) fr$
> java.io.IOException: Server not running, aborting
>         at
> org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2299)
>         at
> org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1748)
>         at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>         at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-08-23 02:44:09,382 WARN
> org.apache.hadoop.hbase.regionserver.HRegionServer: unable to report to
> master for 573861 milliseconds - retrying
> 2009-08-23 02:44:12,636 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 1818 on 60020, call getClosestRowBefore([B@186e7b, [B@1fd8093,
> [B@4748a) f$
> java.io.IOException: Server not running, aborting
>         at
> org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2299)
>         at
> org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1748)
>         at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>         at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-08-23 02:44:12,660 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 1815 on 60020, call next(-595952201864210105) from
> 192.168.1.3:53998: erro$
> java.io.IOException: Server not running, aborting
>         at
> org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2299)
>         at
> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1914)
>         at
> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1899)
>         at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>         at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-08-23 02:44:14,644 WARN
> org.apache.hadoop.hbase.regionserver.HRegionServer: Processing message
> (Retry: 1)
> org.apache.hadoop.hbase.Leases$LeaseStillHeldException
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> Method)
>         at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>         at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>         at
> org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:94)
>         at
> org.apache.hadoop.hbase.RemoteExceptionHandler.checkThrowable(RemoteExceptionHandler.java:48)
>         at
> org.apache.hadoop.hbase.RemoteExceptionHandler.checkIOException(RemoteExceptionHandler.java:66)
>         at
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:572)
>         at java.lang.Thread.run(Thread.java:619)
> 2009-08-23 02:44:14,645 WARN
> org.apache.hadoop.hbase.regionserver.HRegionServer: unable to report to
> master for 579124 milliseconds - retrying
> 2009-08-23 02:44:15,678 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics:
> request=0.0, regions=5, stores=37, storefiles=39, storefile$
> 2009-08-23 02:44:15,678 WARN
> org.apache.hadoop.hbase.regionserver.HRegionServer: Processing message
> (Retry: 2)
> org.apache.hadoop.hbase.Leases$LeaseStillHeldException
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> Method)
>         at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>         at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>         at
> org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:94)
>         at
> org.apache.hadoop.hbase.RemoteExceptionHandler.checkThrowable(RemoteExceptionHandler.java:48)
>         at
> org.apache.hadoop.hbase.RemoteExceptionHandler.checkIOException(RemoteExceptionHandler.java:66)
>         at
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:572)
>         at java.lang.Thread.run(Thread.java:619)
> 2009-08-23 02:44:15,679 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Stop was requested,
> clearing the toDo despite of the exception
> 2009-08-23 02:44:15,679 INFO org.apache.hadoop.ipc.HBaseServer: Stopping
> server on 60020
> 2009-08-23 02:44:17,328 INFO
> org.apache.hadoop.hbase.regionserver.LogFlusher:
> regionserver/192.168.1.3:60020.logFlusher exiting
> 2009-08-23 02:44:17,998 INFO
> org.apache.hadoop.hbase.regionserver.CompactSplitThread:
> regionserver/192.168.1.3:60020.compactor exiting
> 2009-08-23 02:44:18,108 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: worker thread exiting
> 2009-08-23 02:44:18,335 INFO org.apache.hadoop.hbase.Leases:
> regionserver/192.168.1.3:60020.leaseChecker closing leases
> 2009-08-23 02:44:18,335 INFO org.apache.hadoop.hbase.Leases:
> regionserver/192.168.1.3:60020.leaseChecker closed leases
> 2009-08-23 02:44:18,560 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 1296 on 60020: exiting
> 2009-08-23 02:44:18,560 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 1314 on 60020: exiting
> 2009-08-23 02:44:18,691 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 1217 on 60020: exiting
> 2009-08-23 02:44:18,691 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 2 on 60020: exiting
> 2009-08-23 02:44:18,831 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 3 on 60020: exiting
> 2009-08-23 02:44:18,831 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 9 on 60020: exiting
> 2009-08-23 02:44:18,831 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 7 on 60020: exiting
> 2009-08-23 02:44:18,832 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 13 on 60020: exiting
> 2009-08-23 02:44:18,832 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 10 on 60020: exiting
> 2009-08-23 02:44:18,832 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 14 on 60020: exiting
> 2009-08-23 02:44:18,832 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 16 on 60020: exiting
> 2009-08-23 02:44:18,832 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 18 on 60020: exiting
> 2009-08-23 02:44:37,208 INFO org.apache.hadoop.hdfs.DFSClient:
> org.apache.hadoop.ipc.RemoteException:
> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredExce$
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1317)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1308)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1236)
>         at
> org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
>         at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
>
>         at org.apache.hadoop.ipc.Client.call(Client.java:739)
>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
>         at $Proxy1.addBlock(Unknown Source)
>         at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>         at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>         at $Proxy1.addBlock(Unknown Source)
>         at
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2875)
>         at
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2757)
>         at
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2048)
>         at
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2234)
>
> 2009-08-23 02:44:37,301 INFO org.apache.hadoop.hdfs.DFSClient: Waiting for
> replication for 10 seconds
> 2009-08-23 02:44:37,302 WARN org.apache.hadoop.hdfs.DFSClient:
> NotReplicatedYetException sleeping
> /hbase/.logs/server2,60020,1250975459853/hlog.dat.125100427$
> 2009-08-23 02:44:37,742 INFO org.apache.hadoop.hdfs.DFSClient:
> org.apache.hadoop.ipc.RemoteException:
> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredExce$
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1317)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1308)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1236)
>         at
> org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
>         at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
>
>         at org.apache.hadoop.ipc.Client.call(Client.java:739)
>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
>         at $Proxy1.addBlock(Unknown Source)
>
> MASTER
> 2009-08-23 02:35:32,736 WARN org.apache.zookeeper.ClientCnxn: Exception
> closing session 0x12343f29cb20000 to sun.nio.ch.SelectionKeyImpl@108c1b7
> java.io.IOException: TIMED OUT
>         at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:858)
> 2009-08-23 02:35:34,387 WARN org.apache.zookeeper.ClientCnxn: Exception
> closing session 0x12343f29cb20003 to sun.nio.ch.SelectionKeyImpl@b92956
> java.io.IOException: TIMED OUT
>         at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:858)
> 2009-08-23 02:35:34,410 INFO org.apache.zookeeper.ClientCnxn: Attempting
> connection to server server2/192.168.1.3:2181
> 2009-08-23 02:35:34,410 INFO org.apache.zookeeper.ClientCnxn: Priming
> connection to java.nio.channels.SocketChannel[connected local=/
> 192.168.1.3:50441 remote$
> 2009-08-23 02:35:34,467 INFO org.apache.zookeeper.ClientCnxn: Server
> connection successful
> 2009-08-23 02:35:38,463 INFO org.apache.zookeeper.ClientCnxn: Attempting
> connection to server server2/192.168.1.3:2181
> 2009-08-23 02:35:38,463 INFO org.apache.zookeeper.ClientCnxn: Priming
> connection to java.nio.channels.SocketChannel[connected local=/
> 192.168.1.3:50444 remote$
> 2009-08-23 02:35:38,463 INFO org.apache.zookeeper.ClientCnxn: Server
> connection successful
> 2009-08-23 02:35:57,987 INFO org.apache.hadoop.hbase.master.BaseScanner:
> RegionManager.rootScanner scanning meta region {server: 192.168.1.2:60020,
> regionnam$
> 2009-08-23 02:35:58,103 INFO org.apache.hadoop.hbase.master.ServerManager:
> 2 region servers, 0 dead, average load 5.0
> 2009-08-23 02:36:04,528 INFO org.apache.hadoop.hbase.master.ServerManager:
> server2,60020,1250975459853 znode expired
> 2009-08-23 02:36:04,533 INFO org.apache.hadoop.hbase.master.BaseScanner:
> RegionManager.rootScanner scan of 1 row(s) of meta region {server:
> 192.168.1.2:60020$
> 2009-08-23 02:36:06,779 INFO org.apache.hadoop.hbase.master.RegionManager:
> META region removed from onlineMetaRegions
> 2009-08-23 02:36:06,779 INFO
> org.apache.hadoop.hbase.master.RegionServerOperation: process shutdown of
> server server2,60020,1250975459853: logSplit: false, r$
> 2009-08-23 02:36:36,769 INFO org.apache.hadoop.hbase.regionserver.HLog:
> Splitting 8 hlog(s) in
> hdfs://server2:9000/hbase/.logs/server2,60020,1250975459853
> 2009-08-23 02:36:57,775 WARN org.apache.zookeeper.ClientCnxn: Exception
> closing session 0x12343f29cb20000 to sun.nio.ch.SelectionKeyImpl@508204
> java.io.IOException: TIMED OUT
>         at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:858)
> 2009-08-23 02:36:57,987 INFO org.apache.hadoop.hbase.master.BaseScanner:
> RegionManager.rootScanner scanning meta region {server: 192.168.1.2:60020,
> regionnam$
> 2009-08-23 02:36:58,103 INFO org.apache.hadoop.hbase.master.ServerManager:
> 1 region servers, 1 dead, average load 5.0[server2,60020,1250975459853]
> 2009-08-23 02:36:59,487 INFO org.apache.zookeeper.ClientCnxn: Attempting
> connection to server server2/192.168.1.3:2181
> 2009-08-23 02:36:59,488 INFO org.apache.zookeeper.ClientCnxn: Priming
> connection to java.nio.channels.SocketChannel[connected local=/
> 192.168.1.3:42560 remote$
> 2009-08-23 02:36:59,488 INFO org.apache.zookeeper.ClientCnxn: Server
> connection successful
> 2009-08-23 02:36:59,657 WARN org.apache.zookeeper.ClientCnxn: Exception
> closing session 0x12343f29cb20003 to sun.nio.ch.SelectionKeyImpl@f16ddc
> java.io.IOException: TIMED OUT
>         at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:858)
> 2009-08-23 02:37:01,434 INFO org.apache.zookeeper.ClientCnxn: Attempting
> connection to server server2/192.168.1.3:2181
> 2009-08-23 02:37:01,434 INFO org.apache.zookeeper.ClientCnxn: Priming
> connection to java.nio.channels.SocketChannel[connected local=/
> 192.168.1.3:42563 remote$
> 2009-08-23 02:37:01,434 INFO org.apache.zookeeper.ClientCnxn: Server
> connection successful
> 2009-08-23 02:37:02,727 INFO org.apache.hadoop.hbase.master.BaseScanner:
> RegionManager.rootScanner scan of 1 row(s) of meta region {server:
> 192.168.1.2:60020$
> 2009-08-23 02:37:32,729 INFO org.apache.hadoop.hbase.master.ServerManager:
> server,60020,1250975507256 znode expired
> 2009-08-23 02:37:34,118 INFO org.apache.hadoop.hbase.master.RegionManager:
> -ROOT- region unset (but not set to be reassigned)
> 2009-08-23 02:37:58,103 INFO org.apache.hadoop.hbase.master.ServerManager:
> 0 region servers, 2 dead, average load NaN[server,60020,1250975507256,
> server2,600$
> 2009-08-23 02:38:58,103 INFO org.apache.hadoop.hbase.master.ServerManager:
> 0 region servers, 2 dead, average load NaN[server,60020,1250975507256,
> server2,600$
> 2009-08-23 02:39:15,337 WARN org.apache.hadoop.hbase.regionserver.HLog:
> Empty hlog, continuing: org.apache.hadoop.fs.FileStatus@9d684e10 count=0
> java.io.EOFException
>         at java.io.DataInputStream.readFully(DataInputStream.java:180)
>         at java.io.DataInputStream.readFully(DataInputStream.java:152)
>         at
> org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1450)
>         at
> org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1428)
>         at
> org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1417)
>         at
> org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1412)
>         at
> org.apache.hadoop.hbase.regionserver.HLog.splitLog(HLog.java:876)
>         at
> org.apache.hadoop.hbase.regionserver.HLog.splitLog(HLog.java:802)
>         at
> org.apache.hadoop.hbase.master.ProcessServerShutdown.process(ProcessServerShutdown.java:274)
>         at
> org.apache.hadoop.hbase.master.HMaster.processToDoQueue(HMaster.java:492)
>         at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:426)
> 2009-08-23 02:39:17,375 INFO org.apache.hadoop.hbase.regionserver.HLog:
> hlog file splitting completed in 189186 millis for
> hdfs://server2:9000/hbase/.logs/se$
> 2009-08-23 02:39:17,376 INFO
> org.apache.hadoop.hbase.master.RegionServerOperation: Log split complete,
> meta reassignment and scanning:
> 2009-08-23 02:39:17,376 INFO
> org.apache.hadoop.hbase.master.RegionServerOperation: ProcessServerShutdown
> setting to unassigned: {server: 192.168.1.3:60020, r$
> 2009-08-23 02:39:17,376 INFO
> org.apache.hadoop.hbase.master.RegionServerOperation: process shutdown of
> server server,60020,1250975507256: logSplit: false, ro$
> 2009-08-23 02:39:17,382 INFO org.apache.hadoop.hbase.regionserver.HLog:
> Splitting 9 hlog(s) in
> hdfs://server2:9000/hbase/.logs/server,60020,1250975507256
> 2009-08-23 02:39:58,103 INFO org.apache.hadoop.hbase.master.ServerManager:
> 0 region servers, 2 dead, average load NaN[server,60020,1250975507256,
> server2,600$
> 2009-08-23 02:40:20,775 WARN org.apache.zookeeper.ClientCnxn: Exception
> closing session 0x12343f29cb20000 to sun.nio.ch.SelectionKeyImpl@110c0df
> java.io.IOException: TIMED OUT
>         at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:858)
> 2009-08-23 02:40:21,585 WARN org.apache.zookeeper.ClientCnxn: Exception
> closing session 0x12343f29cb20003 to sun.nio.ch.SelectionKeyImpl@e696f9
> java.io.IOException: TIMED OUT
>         at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:858)
> 2009-08-23 02:40:22,432 INFO org.apache.zookeeper.ClientCnxn: Attempting
> connection to server server2/192.168.1.3:2181
> 2009-08-23 02:40:22,433 INFO org.apache.zookeeper.ClientCnxn: Priming
> connection to java.nio.channels.SocketChannel[connected local=/
> 192.168.1.3:43272 remote$
> 2009-08-23 02:40:22,433 INFO org.apache.zookeeper.ClientCnxn: Server
> connection successful
> 2009-08-23 02:40:22,854 INFO org.apache.zookeeper.ClientCnxn: Attempting
> connection to server server2/192.168.1.3:2181
> 2009-08-23 02:40:22,854 INFO org.apache.zookeeper.ClientCnxn: Priming
> connection to java.nio.channels.SocketChannel[connected local=/
> 192.168.1.3:43273 remote$
> 2009-08-23 02:40:22,854 INFO org.apache.zookeeper.ClientCnxn: Server
> connection successful
> 2009-08-23 02:40:58,103 INFO org.apache.hadoop.hbase.master.ServerManager:
> 0 region servers, 2 dead, average load NaN[server,60020,1250975507256,
> server2,600$
> 2009-08-23 02:41:11,930 INFO org.apache.hadoop.hbase.regionserver.HLog:
> hlog file splitting completed in 114553 millis for
> hdfs://server2:9000/hbase/.logs/se$
> 2009-08-23 02:41:11,930 INFO
> org.apache.hadoop.hbase.master.RegionServerOperation: Log split complete,
> meta reassignment and scanning:
> 2009-08-23 02:41:12,135 INFO
> org.apache.hadoop.hbase.master.RegionServerOperation: ProcessServerShutdown
> reassigning ROOT region
> 2009-08-23 02:41:12,135 INFO org.apache.hadoop.hbase.master.RegionManager:
> -ROOT- region unset (but not set to be reassigned)
> 2009-08-23 02:41:12,135 INFO org.apache.hadoop.hbase.master.RegionManager:
> ROOT inserted into regionsInTransition
> 2009-08-23 02:41:58,103 INFO org.apache.hadoop.hbase.master.ServerManager:
> 0 region servers, 2 dead, average load NaN[server,60020,1250975507256,
> server2,600$
> 2009-08-23 02:43:29,197 WARN org.apache.zookeeper.ClientCnxn: Exception
> closing session 0x12343f29cb20000 to sun.nio.ch.SelectionKeyImpl@d89c2
> java.io.IOException: TIMED OUT
>         at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:858)
> 2009-08-23 02:43:29,755 WARN org.apache.zookeeper.ClientCnxn: Exception
> closing session 0x12343f29cb20003 to sun.nio.ch.SelectionKeyImpl@cf209b
> java.io.IOException: TIMED OUT
>         at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:858)
> 2009-08-23 02:43:31,021 INFO org.apache.zookeeper.ClientCnxn: Attempting
> connection to server server2/192.168.1.3:2181
> 2009-08-23 02:43:31,022 INFO org.apache.zookeeper.ClientCnxn: Priming
> connection to java.nio.channels.SocketChannel[connected local=/
> 192.168.1.3:44084 remote$
> 2009-08-23 02:43:31,081 INFO org.apache.zookeeper.ClientCnxn: Attempting
> connection to server server2/192.168.1.3:2181
> 2009-08-23 02:43:31,081 INFO org.apache.zookeeper.ClientCnxn: Priming
> connection to java.nio.channels.SocketChannel[connected local=/
> 192.168.1.3:44085 remote$
> 2009-08-23 02:43:31,091 INFO org.apache.zookeeper.ClientCnxn: Server
> connection successful
> 2009-08-23 02:43:31,091 INFO org.apache.zookeeper.ClientCnxn: Server
> connection successful
> 2009-08-23 02:43:58,228 INFO org.apache.hadoop.hbase.master.ServerManager:
> 0 region servers, 2 dead, average load NaN[server,60020,1250975507256,
> server2,600$
> 2009-08-23 02:44:06,668 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 67 on 60000, call regionServerReport(address: 192.168.1.3:60020,
> startcode$
> org.apache.hadoop.hbase.Leases$LeaseStillHeldException
>         at
> org.apache.hadoop.hbase.master.ServerManager.regionServerReport(ServerManager.java:250)
>         at
> org.apache.hadoop.hbase.master.HMaster.regionServerReport(HMaster.java:722)
>         at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>         at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-08-23 02:44:09,991 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 68 on 60000, call regionServerReport(address: 192.168.1.3:60020,
> startcode$
> org.apache.hadoop.hbase.Leases$LeaseStillHeldException
>         at
> org.apache.hadoop.hbase.master.ServerManager.regionServerReport(ServerManager.java:250)
>         at
> org.apache.hadoop.hbase.master.HMaster.regionServerReport(HMaster.java:722)
>         at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>         at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-08-23 02:44:15,594 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 69 on 60000, call regionServerReport(address: 192.168.1.3:60020,
> startcode$
> org.apache.hadoop.hbase.Leases$LeaseStillHeldException
>         at
> org.apache.hadoop.hbase.master.ServerManager.regionServerReport(ServerManager.java:250)
>         at
> org.apache.hadoop.hbase.master.HMaster.regionServerReport(HMaster.java:722)
>         at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>         at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-08-23 02:44:24,441 ERROR org.apache.hadoop.hbase.master.BaseScanner:
> Closing scanner
> java.io.IOException: Call to /192.168.1.3:60020 failed on local exception:
> java.io.EOFException
>         at
> org.apache.hadoop.hbase.ipc.HBaseClient.wrapException(HBaseClient.java:757)
>         at
> org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:727)
>         at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:328)
>         at $Proxy1.close(Unknown Source)
>         at
> org.apache.hadoop.hbase.master.BaseScanner.scanRegion(BaseScanner.java:203)
>         at
> org.apache.hadoop.hbase.master.MetaScanner.scanOneMetaRegion(MetaScanner.java:73)
>         at
> org.apache.hadoop.hbase.master.MetaScanner.maintenanceScan(MetaScanner.java:129)
>         at
> org.apache.hadoop.hbase.master.BaseScanner.chore(BaseScanner.java:135)
>         at org.apache.hadoop.hbase.Chore.run(Chore.java:68)
> Caused by: java.io.EOFException
>         at java.io.DataInputStream.readInt(DataInputStream.java:375)
>         at
> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.receiveResponse(HBaseClient.java:505)
>         at
> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:449)
> 2009-08-23 02:44:24,474 WARN org.apache.hadoop.hbase.master.BaseScanner:
> Scan one META region: {server: 192.168.1.3:60020, regionname: .META.,,1,
> startKey: <$
> java.io.IOException: java.io.IOException: Server not running, aborting
>         at
> org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2299)
>         at
> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1914)
>         at
> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1899)
>         at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>         at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
>
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> Method)
>         at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>         at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>         at
> org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:94)
>         at
> org.apache.hadoop.hbase.master.BaseScanner.scanRegion(BaseScanner.java:191)
>         at
> org.apache.hadoop.hbase.master.MetaScanner.scanOneMetaRegion(MetaScanner.java:73)
>         at
> org.apache.hadoop.hbase.master.MetaScanner.maintenanceScan(MetaScanner.java:129)
>         at
> org.apache.hadoop.hbase.master.BaseScanner.chore(BaseScanner.java:135)
>         at org.apache.hadoop.hbase.Chore.run(Chore.java:68)
> 2009-08-23 02:44:24,474 INFO org.apache.hadoop.hbase.master.BaseScanner:
> All 1 .META. region(s) scanned
> 2009-08-23 02:44:24,474 INFO org.apache.hadoop.hbase.master.BaseScanner:
> All 0 .META. region(s) scanned
> 2009-08-23 02:44:58,103 INFO org.apache.hadoop.hbase.master.ServerManager:
> 0 region servers, 2 dead, average load NaN[server,60020,1250975507256,
> server2,600$
> 2009-08-23 02:45:24,474 INFO org.apache.hadoop.hbase.master.BaseScanner:
> All 0 .META. region(s) scanned
> 2009-08-23 02:45:58,103 INFO org.apache.hadoop.hbase.master.ServerManager:
> 0 region servers, 2 dead, average load NaN[server,60020,1250975507256,
> server2,600$
> 2009-08-23 02:46:24,475 INFO org.apache.hadoop.hbase.master.BaseScanner:
> All 0 .META. region(s) scanned
> 2009-08-23 02:46:58,104 INFO org.apache.hadoop.hbase.master.ServerManager:
> 0 region servers, 2 dead, average load NaN[server,60020,1250975507256,
> server2,600$
> 2009-08-23 02:47:24,474 INFO org.apache.hadoop.hbase.master.BaseScanner:
> All 0 .META. region(s) scanned
> 2009-08-23 02:47:58,103 INFO org.apache.hadoop.hbase.master.ServerManager:
> 0 region servers, 2 dead, average load NaN[server,60020,1250975507256,
> server2,600$
> 2009-08-23 02:48:24,474 INFO org.apache.hadoop.hbase.master.BaseScanner:
> All 0 .META. region(s) scanned
> 2009-08-23 02:48:58,103 INFO org.apache.hadoop.hbase.master.ServerManager:
> 0 region servers, 2 dead, average load NaN[server,60020,1250975507256,
> server2,600$
> 2009-08-23 02:49:24,474 INFO org.apache.hadoop.hbase.master.BaseScanner:
> All 0 .META. region(s) scanned
> 2009-08-23 02:49:58,103 INFO org.apache.hadoop.hbase.master.ServerManager:
> 0 region servers, 2 dead, average load NaN[server,60020,1250975507256,
> server2,600$
> 2009-08-23 02:50:24,474 INFO org.apache.hadoop.hbase.master.BaseScanner:
> All 0 .META. region(s) scanned
> 2009-08-23 02:50:58,104 INFO org.apache.hadoop.hbase.master.ServerManager:
> 0 region servers, 2 dead, average load NaN[server,60020,1250975507256,
> server2,600$
> 2009-08-23 02:51:24,474 INFO org.apache.hadoop.hbase.master.BaseScanner:
> All 0 .META. region(s) scanned
> 2009-08-23 02:51:58,103 INFO org.apache.hadoop.hbase.master.ServerManager:
> 0 region servers, 2 dead, average load NaN[server,60020,1250975507256,
> server2,600$
> 2009-08-23 02:52:24,474 INFO org.apache.hadoop.hbase.master.BaseScanner:
> All 0 .META. region(s) scanned
> 2009-08-23 02:52:58,103 INFO org.apache.hadoop.hbase.master.ServerManager:
> 0 region servers, 2 dead, average load NaN[server,60020,1250975507256,
> server2,600$
> 2009-08-23 02:53:24,474 INFO org.apache.hadoop.hbase.master.BaseScanner:
> All 0 .META. region(s) scanned
> 2009-08-23 02:53:58,103 INFO org.apache.hadoop.hbase.master.ServerManager:
> 0 region servers, 2 dead, average load NaN[server,60020,1250975507256,
> server2,600$
> 2009-08-23 02:54:24,478 INFO org.apache.hadoop.hbase.master.BaseScanner:
> All 0 .META. region(s) scanned
> 2009-08-23 02:54:58,104 INFO org.apache.hadoop.hbase.master.ServerManager:
> 0 region servers, 2 dead, average load NaN[server,60020,1250975507256,
> server2,600$
> 2009-08-23 02:55:24,478 INFO org.apache.hadoop.hbase.master.BaseScanner:
> All 0 .META. region(s) scanned
> 2009-08-23 02:55:58,104 INFO org.apache.hadoop.hbase.master.ServerManager:
> 0 region servers, 2 dead, average load NaN[server,60020,1250975507256,
> server2,600$
> 2009-08-23 02:56:24,478 INFO org.apache.hadoop.hbase.master.BaseScanner:
> All 0 .META. region(s) scanned
> 2009-08-23 02:56:58,104 INFO org.apache.hadoop.hbase.master.ServerManager:
> 0 region servers, 2 dead, average load NaN[server,60020,1250975507256,
> server2,600$
> 2009-08-23 02:57:24,479 INFO org.apache.hadoop.hbase.master.BaseScanner:
> All 0 .META. region(s) scanned
>
>
>
>
> On Sun, Aug 23, 2009 at 3:07 AM, Andrew Purtell <apurtell@apache.org>wrote:
>
>> See http://issues.apache.org/jira/browse/HDFS-200. I think support for
>> what HBase needs from HDFS is almost there. Alternatively you could consider
>> a different type of underlying filesystem -- Lustre, Gluster, etc.
>>
>> However, you say that HBase crashes for a very small table. I wonder why
>> that is. For example I have a table with more than 1TB of data and hundreds
>> of regions over 4 servers only and HBase is stable. What hardware resources
>> are you running HBase on (CPU, RAM, disk, number of servers, etc.)? What
>> other processes are you running on those servers? What are your observations
>> around the time of the crash? For example, have you noticed exceptions in
>> the HBase or DFS logs? I think getting to the bottom of this would help you
>> more than explicit flushing as a workaround.
>>
>>   - Andy
>>
>> 2009/8/23 Lucas Nazário dos Santos <nazario.lucas@gmail.com>
>>
>> > Hi,
>> >
>> > I have a very small table under HBase that I use to store control data
>> of
>> > my program. If HBase crashes and I have to kill it, the table goes away.
>> > However, I can still see the table inside HDFS.
>> >
>> > What is more interesting is that if I stop HBase properly, the table
>> seems
>> > to be persisted and I don't lose it anymore as consequence of a crash.
>> >
>> > Because the table is very small, I think it's not being flushed to HDFS
>> > (the table itself or/and the meta info). I try to flush everything with
>> > HTable#flushCommits and HBaseAdmin#flush with no success. Does anybody
>> has
>> > already gone through it? How can a flush EVERYTHING to HDFS so I won't
>> lose
>> > data as consequence of a kill -9? Any special configuration inside
>> > hbase-site.xml?
>> >
>> > I'm using HBase 0.20.0 RC2 together with Hadoop 0.20.0.
>> >
>> > Thanks,
>> > Lucas
>> >
>>
>>
>>
>> --
>> Best Regards,
>> Chen Xinli
>>
>>
>>
>>
>>
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message