phoenix-dev mailing list archives

From "Samarth Jain (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-4110) ParallelRunListener should monitor number of tables and not number of tests
Date Wed, 23 Aug 2017 16:45:00 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-4110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16138621#comment-16138621 ]

Samarth Jain commented on PHOENIX-4110:
---------------------------------------

Looks like this change didn't help. I ran the suite locally and monitored the Java heap of
the forked processes. Even though we are now shutting down the mini-cluster more often, the
heap keeps growing as tests progress. So I took a heap dump of one of the JVMs and analyzed
it in a profiler. Instances of three classes (MetricsSystemImpl, HRegion and Configuration)
occupy most of the heap (about 93%):

{code}
One instance of "org.apache.hadoop.metrics2.impl.MetricsSystemImpl" loaded by "sun.misc.Launcher$AppClassLoader
@ 0x75a025670" occupies 201,306,192 (10.95%) bytes.

717 instances of "org.apache.hadoop.hbase.regionserver.HRegion", loaded by "sun.misc.Launcher$AppClassLoader
@ 0x75a025670" occupy 1,218,750,256 (66.30%) bytes. 

2,040 instances of "org.apache.hadoop.conf.Configuration", loaded by "sun.misc.Launcher$AppClassLoader
@ 0x75a025670" occupy 287,352,096 (15.63%) bytes. 
{code}

- MetricsSystemImpl is a singleton, i.e. it is supposed to be created once. It doesn't get
shut down when the mini cluster is shut down. One option would be for us to shut it down
ourselves when we shut down the mini cluster.

- The bulk of the heap is occupied by HRegion objects. It looks like in certain cases, when a
region server is being stopped, not all of its regions get closed. On inspecting the path of
strong references to HRegion, they seem to come from thread objects of the class
JVMClusterUtil$RegionServerThread. Looking at the HBase code, I see that when a region server
starts, it registers its thread with the JVM's shutdown hook mechanism. This reference sticks
around even after the thread itself has terminated. So when the regions are not closed, this
thread object keeps the HRegions in memory, resulting in a memory leak. I will file an HBase
JIRA for this.
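
To illustrate the retention pattern described above, here is a minimal JDK-only sketch. It is not the HBase code: FakeServerThread, Registration and the helper methods are hypothetical stand-ins, and the assumption is simply that a registered shutdown hook holds a reference to a thread that owns heavy state. It shows that the state stays strongly reachable after the thread has terminated, until the hook is removed with Runtime.removeShutdownHook.

```java
import java.lang.ref.WeakReference;

final class ShutdownHookLeakDemo {

    /** Hypothetical stand-in for a server thread that owns heavy state. */
    static final class FakeServerThread extends Thread {
        final byte[] regions = new byte[4 * 1024 * 1024]; // pretend these are open regions

        @Override
        public void run() {
            // Exits immediately; the point is what happens after termination.
        }
    }

    /** The registered hook plus a weak handle used to observe reachability. */
    static final class Registration {
        final Thread hook;
        final WeakReference<FakeServerThread> serverRef;

        Registration(Thread hook, WeakReference<FakeServerThread> serverRef) {
            this.hook = hook;
            this.serverRef = serverRef;
        }
    }

    /** Registers a hook that references the server thread, then runs the thread to completion. */
    static Registration startAndStopServer() {
        FakeServerThread server = new FakeServerThread();
        // The hook captures 'server', so the JVM's hook registry now references it.
        Thread hook = new Thread(server::interrupt, "fake-shutdown-hook");
        Runtime.getRuntime().addShutdownHook(hook);
        server.start();
        try {
            server.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new Registration(hook, new WeakReference<>(server));
    }

    /** Best effort only: System.gc() is a hint, so this polls for a short while. */
    static boolean awaitCollected(WeakReference<?> ref) {
        for (int i = 0; i < 20 && ref.get() != null; i++) {
            System.gc();
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return ref.get() == null;
    }

    public static void main(String[] args) {
        Registration reg = startAndStopServer();
        WeakReference<FakeServerThread> ref = reg.serverRef;
        // The thread has terminated, but the registered hook still pins it.
        System.out.println("reachable while hook registered: " + (ref.get() != null));
        boolean removed = Runtime.getRuntime().removeShutdownHook(reg.hook);
        reg = null; // drop our own reference to the hook
        System.out.println("hook removed: " + removed);
        System.out.println("collected after removal: " + awaitCollected(ref));
    }
}
```

Whether the last line reports true depends on the garbage collector honoring the System.gc() hint, so treat it as an observation aid rather than a guarantee.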

Note that this was with HBase 0.98; I still need to try it with 1.3. Worst case, I think we
may have to resort to halting the JVM after every test. Or maybe come up with a mechanism
(with some help from the surefire plugin) to halt the JVM after every few runs. Or maybe just
call System.gc() and hope for the best :)

Will keep digging.

> ParallelRunListener should monitor number of tables and not number of tests
> ---------------------------------------------------------------------------
>
>                 Key: PHOENIX-4110
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4110
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Samarth Jain
>            Assignee: Samarth Jain
>         Attachments: PHOENIX-4110.patch
>
>
> ParallelRunListener today monitors the number of tests that have been run to determine
> when the mini cluster should be shut down. This helps prevent our test JVM forks from
> running into OOM. A better heuristic would be to instead check the number of tables that
> were created by tests. This way, when a particular test class has created lots of tables,
> we can shut down the mini cluster sooner.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
