hadoop-common-user mailing list archives

From "Kareem Dana" <kareem.d...@gmail.com>
Subject HBase PerformanceEvaluation failing
Date Thu, 15 Nov 2007 23:30:28 GMT
I'm trying to run the HBase PerformanceEvaluation program on a cluster
of 5 hadoop nodes (on virtual machines).

hadoop07 is the DFS master and the HBase master
hadoop08-12 are HBase region servers

I start the test as follows:

$ bin/hadoop jar \
    ${HADOOP_HOME}build/contrib/hbase/hadoop-0.15.0-dev-hbase-test.jar \
    sequentialWrite 2

This starts the sequentialWrite test with 2 clients. After about 25
minutes, with the map tasks about 25% complete and reduce at 6%, the
test fails with the following error:
2007-11-15 17:06:35,100 INFO org.apache.hadoop.mapred.TaskInProgress:
TaskInProgress tip_200711151626_0001_m_000002 has failed 1 times.
2007-11-15 17:06:35,100 INFO org.apache.hadoop.mapred.JobInProgress:
Aborting job job_200711151626_0001
2007-11-15 17:06:35,101 INFO org.apache.hadoop.mapred.TaskInProgress:
Error from task_200711151626_0001_m_000006_0:
org.apache.hadoop.hbase.NoServerForRegionException: failed to find
server for TestTable after 5 retries
	at org.apache.hadoop.hbase.HConnectionManager$TableServers.scanOneMetaRegion(HConnectionManager.java:761)
	at org.apache.hadoop.hbase.HConnectionManager$TableServers.findServersForTable(HConnectionManager.java:521)
	at org.apache.hadoop.hbase.HConnectionManager$TableServers.reloadTableServers(HConnectionManager.java:317)
	at org.apache.hadoop.hbase.HTable.commit(HTable.java:671)
	at org.apache.hadoop.hbase.HTable.commit(HTable.java:636)
	at org.apache.hadoop.hbase.PerformanceEvaluation$SequentialWriteTest.testRow(PerformanceEvaluation.java:493)
	at org.apache.hadoop.hbase.PerformanceEvaluation$Test.test(PerformanceEvaluation.java:356)
	at org.apache.hadoop.hbase.PerformanceEvaluation.runOneClient(PerformanceEvaluation.java:529)
	at org.apache.hadoop.hbase.PerformanceEvaluation$EvaluationMapTask.map(PerformanceEvaluation.java:184)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:192)
	at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)
	

An HBase region server log shows these errors:
2007-11-15 17:03:00,017 ERROR org.apache.hadoop.hbase.HRegionServer:
error closing region TestTable,2102165,6843477525281170954
org.apache.hadoop.hbase.DroppedSnapshotException: java.io.IOException:
File /tmp/hadoop-kcd/hbase/hregion_TestTable,2102165,6843477525281170954/info/mapfiles/6464987859396543981/data
could only be replicated to 0 nodes, instead of 1
        at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1003)
        at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:293)
        at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)

        at org.apache.hadoop.hbase.HRegion.internalFlushcache(HRegion.java:886)
        at org.apache.hadoop.hbase.HRegion.close(HRegion.java:388)
        at org.apache.hadoop.hbase.HRegionServer.closeAllRegions(HRegionServer.java:978)
        at org.apache.hadoop.hbase.HRegionServer.run(HRegionServer.java:593)
        at java.lang.Thread.run(Thread.java:595)
2007-11-15 17:03:00,615 ERROR org.apache.hadoop.hbase.HRegionServer:
error closing region TestTable,3147654,8929124532081908894
org.apache.hadoop.hbase.DroppedSnapshotException: java.io.IOException:
File /tmp/hadoop-kcd/hbase/hregion_TestTable,3147654,8929124532081908894/info/mapfiles/3451857497397493742/data
could only be replicated to 0 nodes, instead of 1
        at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1003)
        at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:293)
        at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)

        at org.apache.hadoop.hbase.HRegion.internalFlushcache(HRegion.java:886)
        at org.apache.hadoop.hbase.HRegion.close(HRegion.java:388)
        at org.apache.hadoop.hbase.HRegionServer.closeAllRegions(HRegionServer.java:978)
        at org.apache.hadoop.hbase.HRegionServer.run(HRegionServer.java:593)
        at java.lang.Thread.run(Thread.java:595)
2007-11-15 17:03:00,639 ERROR org.apache.hadoop.hbase.HRegionServer:
Close and delete failed
java.io.IOException: java.io.IOException: File
/tmp/hadoop-kcd/hbase/log_172.16.6.57_-3889232888673408171_60020/hlog.dat.005
could only be replicated to 0 nodes, instead of 1
        at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1003)
        at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:293)
        at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:494)
        at org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:82)
        at org.apache.hadoop.hbase.RemoteExceptionHandler.checkIOException(RemoteExceptionHandler.java:48)
        at org.apache.hadoop.hbase.HRegionServer.run(HRegionServer.java:597)
        at java.lang.Thread.run(Thread.java:595)
2007-11-15 17:03:00,640 INFO org.apache.hadoop.hbase.HRegionServer:
telling master that region server is shutting down at:
172.16.6.57:60020
2007-11-15 17:03:00,643 INFO org.apache.hadoop.hbase.HRegionServer:
stopping server at: 172.16.6.57:60020
2007-11-15 17:03:00,643 INFO org.apache.hadoop.hbase.HRegionServer:
regionserver/0.0.0.0:60020 exiting

I can provide more logs if necessary. Any ideas or suggestions about
how I can track this down? Running the sequentialWrite test with just 1
client works fine, but using 2 or more causes these errors.
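
As a starting point for triage, the sketch below tallies the ERROR
entries in a region server log by the first exception each entry names.
This makes it easy to see which failure dominates across all five
region servers. It is a minimal Python sketch, not part of Hadoop or
HBase. The function name summarize_errors and the log path in the usage
note are hypothetical, and it assumes log lines shaped like the excerpts
above (timestamp, level, logger name, colon).

```python
import re
from collections import Counter

# Start of a log entry, e.g.
# "2007-11-15 17:03:00,017 ERROR org.apache.hadoop.hbase.HRegionServer:"
HEADER = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} "
    r"(DEBUG|INFO|WARN|ERROR|FATAL) \S+:"
)
# A fully qualified Java class name ending in Exception or Error.
EXCEPTION = re.compile(
    r"\b((?:[a-z_][\w$]*\.)+[A-Z][\w$]*(?:Exception|Error))\b"
)


def summarize_errors(lines):
    """Count ERROR/FATAL entries by the first exception each one names."""
    counts = Counter()
    in_error = False   # inside an ERROR or FATAL entry
    pending = False    # entry has not been attributed to an exception yet
    for line in lines:
        header = HEADER.match(line)
        if header:
            in_error = header.group(1) in ("ERROR", "FATAL")
            pending = in_error
            text = line[header.end():]  # message may share the header line
        else:
            text = line
        if in_error and pending:
            found = EXCEPTION.search(text)
            if found:
                counts[found.group(1)] += 1
                pending = False
    return counts
```

Possible usage, with a hypothetical log file name:

```python
with open("hbase-regionserver.log") as log:
    for name, count in summarize_errors(log).most_common():
        print(count, name)
```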

Thanks for any help,
Kareem Dana
