hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-14819) hbase-it tests failing with OOME
Date Thu, 19 Nov 2015 01:53:11 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15012594#comment-15012594 ]

stack commented on HBASE-14819:
-------------------------------

bq. This is a failure to allocate PermGen, up it with -XX:MaxPermSize=XXXm, e.g. 512. Only
works with Java <= 7. Java 8 will accept the parameter but throw a warning at JVM startup
(PermGen went away in 8)

Yeah. I was thinking I could up the heap and PermGen would go up accordingly, but I'm not sure
what heap size we are running with. Some percentage of total heap, but it may vary across build
machines, so I need to figure out what's going on here. I was going to set a heap size when we
fork (4G seems good), but that might not be the right thing to do.
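
One way to answer the "what heap size are we running with" question is to have the forked JVM report its own limits. The following is a minimal sketch, not anything in the HBase build: the class name JvmMemoryReport and its methods are made up for illustration, and it uses only the standard java.lang.management API. On a Java 7 HotSpot JVM the pool list should include a PermGen pool; on Java 8 it shows Metaspace instead. As far as I know, HotSpot sizes PermGen separately from -Xmx, so the reported PermGen max would show whether raising the heap alone changes it.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of HBase): prints the limits the current JVM is running with.
public class JvmMemoryReport {

    /** Maximum heap the JVM will attempt to use, in bytes (roughly the effective -Xmx). */
    public static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    /** One line per memory pool (heap and non-heap) with its current use and maximum. */
    public static List<String> describePools() {
        List<String> lines = new ArrayList<String>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            long max = usage.getMax(); // -1 means the maximum is undefined
            lines.add(pool.getName() + " [" + pool.getType() + "] used=" + usage.getUsed()
                + " max=" + (max < 0 ? "undefined" : String.valueOf(max)));
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println("max heap bytes=" + maxHeapBytes());
        for (String line : describePools()) {
            System.out.println(line);
        }
    }
}
```

Running this once inside the forked test JVM on a build machine and once locally would show whether the two environments really differ.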

Then I tried running these locally so I could study them. The first one fails with the above at
2k+ threads (which is crazy). Let me add a 'fix' so that we use ONLY 1500 threads in the test.
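
To see how many threads a test JVM actually has, the standard ThreadMXBean can be polled. This is only a sketch under assumptions: the class ThreadCountCheck and its methods are hypothetical names, and the 1500 in main is simply the figure mentioned above, not a value read from any HBase configuration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical helper (not part of HBase): reports live and peak thread counts for this JVM.
public class ThreadCountCheck {

    /** Number of live threads right now, daemon and non-daemon. */
    public static int liveThreads() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    /** Highest live thread count seen since the JVM started (or since the peak was reset). */
    public static int peakThreads() {
        return ManagementFactory.getThreadMXBean().getPeakThreadCount();
    }

    /** True if the current live thread count is above the given limit. */
    public static boolean exceeds(int limit) {
        return liveThreads() > limit;
    }

    public static void main(String[] args) {
        int limit = 1500; // assumed cap, taken from the comment above
        System.out.println("live=" + liveThreads() + " peak=" + peakThreads()
            + " overLimit=" + exceeds(limit));
    }
}
```

Logging the peak at the end of a test run would show how close the test gets to the 2k+ mark.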


The other ITs just fail for all kinds of reasons. TODO.

> hbase-it tests failing with OOME
> --------------------------------
>
>                 Key: HBASE-14819
>                 URL: https://issues.apache.org/jira/browse/HBASE-14819
>             Project: HBase
>          Issue Type: Sub-task
>          Components: test
>            Reporter: stack
>         Attachments: Screen Shot 2015-11-16 at 11.37.41 PM.png
>
>
> Let me up the heap used when failsafe forks.
> Here is example OOME doing ITBLL:
> {code}
> 2015-11-16 03:09:15,073 INFO  [Thread-694] actions.BatchRestartRsAction(69): Starting region server:asf905.gq1.ygridcore.net
> 2015-11-16 03:09:15,099 INFO  [Thread-694] client.ConnectionUtils(104): regionserver/asf905.gq1.ygridcore.net/67.195.81.149:0 server-side HConnection retries=350
> 2015-11-16 03:09:15,099 INFO  [Thread-694] ipc.SimpleRpcScheduler(128): Using deadline as user call queue, count=1
> 2015-11-16 03:09:15,101 INFO  [Thread-694] ipc.RpcServer$Listener(607): regionserver/asf905.gq1.ygridcore.net/67.195.81.149:0: started 3 reader(s) listening on port=36114
> 2015-11-16 03:09:15,103 INFO  [Thread-694] fs.HFileSystem(252): Added intercepting call to namenode#getBlockLocations so can do block reordering using class class org.apache.hadoop.hbase.fs.HFileSystem$ReorderWALBlocks
> 2015-11-16 03:09:15,104 INFO  [Thread-694] zookeeper.RecoverableZooKeeper(120): Process identifier=regionserver:36114 connecting to ZooKeeper ensemble=localhost:50139
> 2015-11-16 03:09:15,117 DEBUG [Thread-694-EventThread] zookeeper.ZooKeeperWatcher(554): regionserver:361140x0, quorum=localhost:50139, baseZNode=/hbase Received ZooKeeper Event, type=None, state=SyncConnected, path=null
> 2015-11-16 03:09:15,118 DEBUG [Thread-694] zookeeper.ZKUtil(492): regionserver:361140x0, quorum=localhost:50139, baseZNode=/hbase Set watcher on existing znode=/hbase/master
> 2015-11-16 03:09:15,119 DEBUG [Thread-694] zookeeper.ZKUtil(492): regionserver:361140x0, quorum=localhost:50139, baseZNode=/hbase Set watcher on existing znode=/hbase/running
> 2015-11-16 03:09:15,119 DEBUG [Thread-694-EventThread] zookeeper.ZooKeeperWatcher(638): regionserver:36114-0x1510e2c6f1d0029 connected
> 2015-11-16 03:09:15,120 INFO  [RpcServer.responder] ipc.RpcServer$Responder(926): RpcServer.responder: starting
> 2015-11-16 03:09:15,121 INFO  [RpcServer.listener,port=36114] ipc.RpcServer$Listener(738): RpcServer.listener,port=36114: starting
> 2015-11-16 03:09:15,121 DEBUG [Thread-694] ipc.RpcExecutor(115): B.default Start Handler index=0 queue=0
> 2015-11-16 03:09:15,121 DEBUG [Thread-694] ipc.RpcExecutor(115): B.default Start Handler index=1 queue=0
> 2015-11-16 03:09:15,121 DEBUG [Thread-694] ipc.RpcExecutor(115): B.default Start Handler index=2 queue=0
> 2015-11-16 03:09:15,122 DEBUG [Thread-694] ipc.RpcExecutor(115): B.default Start Handler index=3 queue=0
> 2015-11-16 03:09:15,122 DEBUG [Thread-694] ipc.RpcExecutor(115): B.default Start Handler index=4 queue=0
> 2015-11-16 03:09:15,122 DEBUG [Thread-694] ipc.RpcExecutor(115): Priority Start Handler index=0 queue=0
> 2015-11-16 03:09:15,123 DEBUG [Thread-694] ipc.RpcExecutor(115): Priority Start Handler index=1 queue=1
> 2015-11-16 03:09:15,123 DEBUG [Thread-694] ipc.RpcExecutor(115): Priority Start Handler index=2 queue=0
> 2015-11-16 03:09:15,123 DEBUG [Thread-694] ipc.RpcExecutor(115): Priority Start Handler index=3 queue=1
> 2015-11-16 03:09:15,124 DEBUG [Thread-694] ipc.RpcExecutor(115): Priority Start Handler index=4 queue=0
> 2015-11-16 03:09:15,124 DEBUG [Thread-694] ipc.RpcExecutor(115): Replication Start Handler index=0 queue=0
> 2015-11-16 03:09:15,124 DEBUG [Thread-694] ipc.RpcExecutor(115): Replication Start Handler index=1 queue=0
> 2015-11-16 03:09:15,124 DEBUG [Thread-694] ipc.RpcExecutor(115): Replication Start Handler index=2 queue=0
> 2015-11-16 03:09:15,761 DEBUG [RS:0;asf905:36114] client.ConnectionManager$HConnectionImplementation(715): connection construction failed
> java.io.IOException: java.lang.OutOfMemoryError: PermGen space
> 	at org.apache.hadoop.hbase.client.RegistryFactory.getRegistry(RegistryFactory.java:43)
> 	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.setupRegistry(ConnectionManager.java:886)
> 	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:692)
> 	at org.apache.hadoop.hbase.client.ConnectionUtils$2.<init>(ConnectionUtils.java:154)
> 	at org.apache.hadoop.hbase.client.ConnectionUtils.createShortCircuitConnection(ConnectionUtils.java:154)
> 	at org.apache.hadoop.hbase.regionserver.HRegionServer.createClusterConnection(HRegionServer.java:689)
> 	at org.apache.hadoop.hbase.regionserver.HRegionServer.setupClusterConnection(HRegionServer.java:720)
> 	at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:733)
> 	at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:889)
> 	at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.runRegionServer(MiniHBaseCluster.java:156)
> 	at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.access$000(MiniHBaseCluster.java:108)
> 	at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer$1.run(MiniHBaseCluster.java:140)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:356)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
> 	at org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs(User.java:334)
> 	at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.run(MiniHBaseCluster.java:138)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.OutOfMemoryError: PermGen space
> 	at sun.misc.Unsafe.defineClass(Native Method)
> 	at sun.reflect.ClassDefiner.defineClass(ClassDefiner.java:63)
> 	at sun.reflect.MethodAccessorGenerator$1.run(MethodAccessorGenerator.java:399)
> 	at sun.reflect.MethodAccessorGenerator$1.run(MethodAccessorGenerator.java:396)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at sun.reflect.MethodAccessorGenerator.generate(MethodAccessorGenerator.java:395)
> 	at sun.reflect.MethodAccessorGenerator.generateConstructor(MethodAccessorGenerator.java:94)
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:48)
> 	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> 	at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> 	at java.lang.Class.newInstance(Class.java:383)
> 	at org.apache.hadoop.hbase.client.RegistryFactory.getRegistry(RegistryFactory.java:41)
> 	... 17 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
