hadoop-common-user mailing list archives

From stack <st...@duboce.net>
Subject Re: HBase PerformanceEvaluation failing
Date Fri, 16 Nov 2007 01:01:20 GMT
Is your DFS healthy?  This seems odd: "File
/tmp/hadoop-kcd/hbase/hregion_TestTable,2102165,6843477525281170954/info/mapfiles/6464987859396543981/data
could only be replicated to 0 nodes, instead of 1".  In my experience, IIRC,
that means no datanodes are running.
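
To see how widespread the problem is, something like the following could be run over a region server log. This is only a rough sketch, not part of HBase or Hadoop: the function name `find_replication_failures` and the sample text are made up for illustration, and the pattern assumes the message wording matches the log lines quoted below.

```python
import re
from collections import Counter

# Matches the NameNode error quoted in this thread. \s+ lets the match
# span a line break between the file path and the rest of the message.
REPLICATION_ERROR = re.compile(
    r"File\s+(?P<path>\S+)\s+could only be replicated to "
    r"(?P<got>\d+) nodes, instead of (?P<want>\d+)"
)


def find_replication_failures(log_text):
    """Count replication failures per DFS file path (hypothetical helper)."""
    return Counter(m.group("path") for m in REPLICATION_ERROR.finditer(log_text))


# Sample text copied from the region server log in the quoted message.
sample = (
    "java.io.IOException: java.io.IOException: File\n"
    "/tmp/hadoop-kcd/hbase/log_172.16.6.57_-3889232888673408171_60020/hlog.dat.005\n"
    "could only be replicated to 0 nodes, instead of 1\n"
)

failures = find_replication_failures(sample)
for path, count in failures.items():
    print(count, path)
```

If every write is failing this way, it is probably worth checking how many datanodes the namenode can actually see (`bin/hadoop dfsadmin -report` should list them) before digging into HBase itself.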

(I just tried the PE from TRUNK and it ran to completion).

St.Ack

Kareem Dana wrote:
> I'm trying to run the HBase PerformanceEvaluation program on a cluster
> of 5 hadoop nodes (on virtual machines).
>
> hadoop07 is a DFS Master and HBase master
> hadoop08-12 are HBase region servers
>
> I start the test as follows:
>
> $ bin/hadoop jar
> ${HADOOP_HOME}build/contrib/hbase/hadoop-0.15.0-dev-hbase-test.jar
> sequentialWrite 2
>
> This starts the sequentialWrite test with 2 clients. After about 25
> minutes, with the map tasks about 25% complete and reduce at 6%, the test
> fails with the following error:
> 2007-11-15 17:06:35,100 INFO org.apache.hadoop.mapred.TaskInProgress:
> TaskInProgress tip_200711151626_0001_m_000002 has failed 1 times.
> 2007-11-15 17:06:35,100 INFO org.apache.hadoop.mapred.JobInProgress:
> Aborting job job_200711151626_0001
> 2007-11-15 17:06:35,101 INFO org.apache.hadoop.mapred.TaskInProgress:
> Error from task_200711151626_0001_m_000006_0:
> org.apache.hadoop.hbase.NoServerForRegionException: failed to find
> server for TestTable after 5 retries
> 	at org.apache.hadoop.hbase.HConnectionManager$TableServers.scanOneMetaRegion(HConnectionManager.java:761)
> 	at org.apache.hadoop.hbase.HConnectionManager$TableServers.findServersForTable(HConnectionManager.java:521)
> 	at org.apache.hadoop.hbase.HConnectionManager$TableServers.reloadTableServers(HConnectionManager.java:317)
> 	at org.apache.hadoop.hbase.HTable.commit(HTable.java:671)
> 	at org.apache.hadoop.hbase.HTable.commit(HTable.java:636)
> 	at org.apache.hadoop.hbase.PerformanceEvaluation$SequentialWriteTest.testRow(PerformanceEvaluation.java:493)
> 	at org.apache.hadoop.hbase.PerformanceEvaluation$Test.test(PerformanceEvaluation.java:356)
> 	at org.apache.hadoop.hbase.PerformanceEvaluation.runOneClient(PerformanceEvaluation.java:529)
> 	at org.apache.hadoop.hbase.PerformanceEvaluation$EvaluationMapTask.map(PerformanceEvaluation.java:184)
> 	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:192)
> 	at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)
> 	
>
> An HBase region server log shows these errors:
> 2007-11-15 17:03:00,017 ERROR org.apache.hadoop.hbase.HRegionServer:
> error closing region TestTable,2102165,6843477525281170954
> org.apache.hadoop.hbase.DroppedSnapshotException: java.io.IOException:
> File /tmp/hadoop-kcd/hbase/hregion_TestTable,2102165,6843477525281170954/info/mapfiles/6464987859396543981/data
> could only be replicated to 0 nodes, instead of 1
>         at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1003)
>         at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:293)
>         at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
>
>         at org.apache.hadoop.hbase.HRegion.internalFlushcache(HRegion.java:886)
>         at org.apache.hadoop.hbase.HRegion.close(HRegion.java:388)
>         at org.apache.hadoop.hbase.HRegionServer.closeAllRegions(HRegionServer.java:978)
>         at org.apache.hadoop.hbase.HRegionServer.run(HRegionServer.java:593)
>         at java.lang.Thread.run(Thread.java:595)
> 2007-11-15 17:03:00,615 ERROR org.apache.hadoop.hbase.HRegionServer:
> error closing region TestTable,3147654,8929124532081908894
> org.apache.hadoop.hbase.DroppedSnapshotException: java.io.IOException:
> File /tmp/hadoop-kcd/hbase/hregion_TestTable,3147654,8929124532081908894/info/mapfiles/3451857497397493742/data
> could only be replicated to 0 nodes, instead of 1
>         at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1003)
>         at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:293)
>         at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
>
>         at org.apache.hadoop.hbase.HRegion.internalFlushcache(HRegion.java:886)
>         at org.apache.hadoop.hbase.HRegion.close(HRegion.java:388)
>         at org.apache.hadoop.hbase.HRegionServer.closeAllRegions(HRegionServer.java:978)
>         at org.apache.hadoop.hbase.HRegionServer.run(HRegionServer.java:593)
>         at java.lang.Thread.run(Thread.java:595)
> 2007-11-15 17:03:00,639 ERROR org.apache.hadoop.hbase.HRegionServer:
> Close and delete failed
> java.io.IOException: java.io.IOException: File
> /tmp/hadoop-kcd/hbase/log_172.16.6.57_-3889232888673408171_60020/hlog.dat.005
> could only be replicated to 0 nodes, instead of 1
>         at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1003)
>         at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:293)
>         at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
>
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:494)
>         at org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:82)
>         at org.apache.hadoop.hbase.RemoteExceptionHandler.checkIOException(RemoteExceptionHandler.java:48)
>         at org.apache.hadoop.hbase.HRegionServer.run(HRegionServer.java:597)
>         at java.lang.Thread.run(Thread.java:595)
> 2007-11-15 17:03:00,640 INFO org.apache.hadoop.hbase.HRegionServer:
> telling master that region server is shutting down at:
> 172.16.6.57:60020
> 2007-11-15 17:03:00,643 INFO org.apache.hadoop.hbase.HRegionServer:
> stopping server at: 172.16.6.57:60020
> 2007-11-15 17:03:00,643 INFO org.apache.hadoop.hbase.HRegionServer:
> regionserver/0.0.0.0:60020 exiting
>
> I can provide some more logs if necessary. Any ideas or suggestions
> about how I can track this down? Running the sequentialWrite test with
> just 1 client works fine, but using 2 or more causes these errors.
>
> Thanks for any help,
> Kareem Dana
>   

