groovy-dev mailing list archives

From Alain Stalder <astal...@span.ch>
Subject Re: Improve Groovy class loading performance and memory management
Date Sat, 28 May 2016 17:15:10 GMT
Hmm, not sure yet, but it looks like the map from the Spring Framework
that I am using treats both keys (Class) and values (ClassInfo) as weak
references. I am not sure yet whether this can easily be changed...
Too bad, this time I really thought I had done enough tests before
posting...

On 28.05.16 16:49, Alain Stalder wrote:
> This is going to be a *very* long mail, but I think it is probably 
> worth it! :)
>
> First of all, although I am not 100% sure, I think I agree with Jochen
> regarding ClassValue - in any case, I find that ClassValue is not a
> viable option to count on in the immediately foreseeable future.
>
> Instead I wrote a PoC based on the Groovy (2.5.0) master with the 
> following highlights:
>
> - In most use cases, classes that are no longer used become
> immediately available for garbage collection.
> - In all cases, garbage collection is possible once the limit on
> Metaspace/PermGen or Heap is reached, i.e. no more OutOfMemoryErrors.
> - In some quick initial tests, it appears to be generally even a bit
> *faster*(!) than the current implementation.
> - (Not using ClassValue at all.)
> - (The two merge requests by John Wagenleitner for GROOVY-7683 (weak 
> reference to Class in ClassInfo) and GROOVY-7646 (explicit cleanup 
> after running scripts in GroovyShell) would become obsolete.)
>
> Let me first define two things:
>
> I will call a class "weakly-collectable" if it can be collected while 
> the VM is running normally, i.e. before any limit on PermGen/Metaspace 
> or Heap is reached, and "softly-collectable" if that only happens when 
> such a limit is reached, but is still possible then, i.e. no 
> OutOfMemoryError.
>
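The two terms mirror the JVM's own distinction between weak and soft references. A minimal sketch in plain Java, not taken from the PoC (the class name ReachabilityDemo is made up for illustration), assuming Java 9+ for Reference.reachabilityFence. Whether and when a referent is actually cleared is up to the collector, so the sketch prints what it observes rather than promising a result.

```java
import java.lang.ref.Reference;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

// Illustration only: ReachabilityDemo is a hypothetical name, not PoC code.
final class ReachabilityDemo {

    // Asks the collector to run a few times; System.gc() is only a hint.
    static void requestGc() {
        for (int i = 0; i < 5; i++) {
            System.gc();
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    public static void main(String[] args) {
        // Two separate targets, so that the soft reference does not keep
        // the weakly referenced object alive.
        Object weakTarget = new byte[1024];
        Object softTarget = new byte[1024];
        WeakReference<Object> weak = new WeakReference<>(weakTarget);
        SoftReference<Object> soft = new SoftReference<>(softTarget);

        requestGc();
        System.out.println("strongly held: weak alive=" + (weak.get() != null)
                + ", soft alive=" + (soft.get() != null));
        // Keep both targets strongly reachable up to this point (Java 9+).
        Reference.reachabilityFence(weakTarget);
        Reference.reachabilityFence(softTarget);

        weakTarget = null;
        softTarget = null;
        requestGc();
        // Typically the weak referent is gone by now, while the soft one
        // survives until memory gets tight; neither is guaranteed.
        System.out.println("released:      weak alive=" + (weak.get() != null)
                + ", soft alive=" + (soft.get() != null));
    }
}
```

In these terms, a "weakly-collectable" class behaves like the weak referent and a "softly-collectable" class like the soft one.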
> I will call the (maybe most typical?) use case where a Java VM
> dynamically compiles and runs some Groovy scripts the
> "script-running-use-case". This generally also includes the case where
> scripts were precompiled and are loaded by a dedicated class loader
> (i.e. not the same class loader as Groovy itself). I will call the
> use case where both Groovy and compiled scripts are loaded by the same
> class loader the "gradle-use-case", like when the Gradle daemon keeps
> running and reloads Groovy and build scripts (as I understand how this
> works - correct me if I got that wrong).
>
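The difference between the two use cases comes down to which class loader sees which classes. A minimal sketch, assuming nothing beyond the JDK (LoaderParentDemo is a hypothetical name): a loader whose parent is the application loader shares the application's classes, while a loader with a null parent delegates only to the bootstrap loader and would have to load everything else itself.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Illustration only: LoaderParentDemo is a hypothetical name.
final class LoaderParentDemo {

    // True if 'loader' can resolve the class with the given name.
    static boolean canLoad(ClassLoader loader, String className) {
        try {
            loader.loadClass(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        String self = LoaderParentDemo.class.getName();
        ClassLoader appLoader = LoaderParentDemo.class.getClassLoader();
        URL[] noUrls = new URL[0];

        try (URLClassLoader child = new URLClassLoader(noUrls, appLoader);
             URLClassLoader isolated = new URLClassLoader(noUrls, null)) {
            // The child delegates to its parent and therefore sees this class.
            System.out.println("child sees demo class:    " + canLoad(child, self));
            // The isolated loader only sees what the bootstrap loader provides.
            System.out.println("isolated sees demo class: " + canLoad(isolated, self));
            System.out.println("isolated sees String:     "
                    + canLoad(isolated, "java.lang.String"));
        }
    }
}
```

With the Groovy jar on the URL list of an isolated loader, Groovy and the scripts would share one loader, which is the situation called "gradle-use-case" above.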
> The status quo with Groovy 2.4.6 is as follows:
>
> - script-running-use-case, use ClassValue: softly-collectable
> - script-running-use-case, don't use ClassValue: not collectable
> (OutOfMemoryError)
> - gradle-use-case, use ClassValue: not collectable (OutOfMemoryError)
> - gradle-use-case, don't use ClassValue: softly-collectable
>
> Now for the PoC...
>
> Here are the PoC branch and diff to master:
> - https://github.com/jexler/groovy/tree/weak-gc-poc
> - https://github.com/jexler/groovy/compare/master...jexler:weak-gc-poc
>
> The core new thing is the class
> org.codehaus.groovy.reflection.ClassInfoMap, which is based on
> ConcurrentReferenceHashMap from the Spring Framework (which in turn
> appears to have originated from JBoss). It basically implements a
> WeakHashMap with thread-safe read/write access.
>
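For readers who want to picture "a WeakHashMap with thread-safe read/write access", here is a rough JDK-only stand-in. It is neither Spring's ConcurrentReferenceHashMap nor the PoC's ClassInfoMap: WeakClassCache is a hypothetical name, and it uses one coarse lock where a real concurrent map would use finer-grained locking.

```java
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Function;

// Illustration only: WeakClassCache is a hypothetical stand-in, not ClassInfoMap.
final class WeakClassCache<V> {

    // WeakHashMap holds its keys weakly; synchronizedMap adds one coarse lock.
    private final Map<Class<?>, V> map =
            Collections.synchronizedMap(new WeakHashMap<>());

    // Returns the cached value for 'type', creating it on first access.
    V get(Class<?> type, Function<Class<?>, V> factory) {
        return map.computeIfAbsent(type, factory);
    }

    int size() {
        return map.size();
    }

    public static void main(String[] args) {
        WeakClassCache<String> cache = new WeakClassCache<>();
        String first = cache.get(String.class, Class::getSimpleName);
        String second = cache.get(String.class, c -> "not used");
        System.out.println(first + ", same instance: " + (first == second)
                + ", entries: " + cache.size());
    }
}
```

One caveat that matters here: java.util.WeakHashMap holds only the keys weakly. A value that strongly references its own key, as a per-class info object referencing its Class would, keeps the entry alive unless the value refers to the key through a WeakReference.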
> In ClassInfo, that new ClassInfoMap is used within GlobalClassSet. 
> (Detail: I have left the ManagedLinkedList<ClassInfo> items in the 
> GlobalClassSet class because at least some Gradle versions seem to 
> access it directly via reflection.)
>
> GroovyClassValue (both the real one based on ClassValue and the pre 
> Java 7 emulation based on ManagedConcurrentMap) is not used at all any 
> more.
>
> The other "half" of the PoC concerns the java.beans.Introspector, 
> because its caches are now the last thing that prevents 
> weakly-collecting unused classes (as I will show a bit later on).
>
> The basic approach here is to cache BeanInfo as a new private member 
> "beanInfo" of ClassInfo and to remove it immediately after creation 
> from Introspector caches. There is also a new public getter 
> classInfo.getBeanInfo() that lazily initializes BeanInfo and returns it.
>
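The "cache it yourself, then evict it from Introspector" idea can be sketched as follows. BeanInfoHolder is a hypothetical stand-in for ClassInfo and not the PoC code; only Introspector.getBeanInfo and Introspector.flushFromCaches are real JDK calls.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;

// Illustration only: BeanInfoHolder stands in for the per-class ClassInfo.
final class BeanInfoHolder {

    // Note: holding the Class strongly is a simplification for this sketch.
    private final Class<?> type;
    private BeanInfo beanInfo; // guarded by 'this'

    BeanInfoHolder(Class<?> type) {
        this.type = type;
    }

    // Lazily creates the BeanInfo, then evicts the class from the
    // Introspector cache so that this holder keeps the only cached copy.
    synchronized BeanInfo getBeanInfo() {
        if (beanInfo == null) {
            try {
                beanInfo = Introspector.getBeanInfo(type);
            } catch (IntrospectionException e) {
                throw new IllegalStateException("Cannot introspect " + type, e);
            }
            Introspector.flushFromCaches(type);
        }
        return beanInfo;
    }

    public static void main(String[] args) {
        BeanInfoHolder holder = new BeanInfoHolder(java.util.Date.class);
        BeanInfo info = holder.getBeanInfo();
        System.out.println("properties: " + info.getPropertyDescriptors().length);
        System.out.println("cached:     " + (info == holder.getBeanInfo()));
    }
}
```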
> For this PoC I provide four options for cleaning up the Introspector
> caches, selected via the system property "weak-gc-poc.cleanup":
>
> - "none": No cleanup, as today
> - "class": The default, call Introspector.flushFromCaches(theClass) 
> after getting beanInfo and storing it in ClassInfo
> - "super": Same as "class", but do the cleanup for the class and all
> of its superclasses (except java.* and javax.*)
> - "all": Clean Introspector caches for all classes, i.e. call 
> Introspector.flushCaches()
>
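The four settings map naturally onto the three Introspector calls involved. A sketch of such a dispatch, assuming the property name and values listed above; the enum CleanupMode, its method names, and the fallback to "class" for unknown values are made up for illustration and may differ from how the PoC does it.

```java
import java.beans.Introspector;
import java.util.Locale;

// Illustration only: CleanupMode is a hypothetical name, not PoC code.
enum CleanupMode {
    NONE, CLASS, SUPER, ALL;

    // Reads the mode from the system property, defaulting to CLASS.
    static CleanupMode fromSystemProperty() {
        return parse(System.getProperty("weak-gc-poc.cleanup"));
    }

    static CleanupMode parse(String value) {
        if (value == null) {
            return CLASS;
        }
        try {
            return valueOf(value.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return CLASS; // assumption: unknown values fall back to the default
        }
    }

    static boolean isJdkClass(Class<?> c) {
        String name = c.getName();
        return name.startsWith("java.") || name.startsWith("javax.");
    }

    // Evicts cached BeanInfo according to this mode.
    void cleanUp(Class<?> type) {
        switch (this) {
            case NONE:
                break;
            case CLASS:
                Introspector.flushFromCaches(type);
                break;
            case SUPER:
                // The class and its superclasses, skipping java.* and javax.*.
                for (Class<?> c = type; c != null; c = c.getSuperclass()) {
                    if (!isJdkClass(c)) {
                        Introspector.flushFromCaches(c);
                    }
                }
                break;
            case ALL:
                Introspector.flushCaches();
                break;
        }
    }

    public static void main(String[] args) {
        CleanupMode mode = fromSystemProperty();
        mode.cleanUp(CleanupMode.class);
        System.out.println("cleanup mode: " + mode);
    }
}
```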
> In the end I suspect only "none" and "class" would be viable options 
> because the others probably have too much impact on performance (more 
> creations of BeanInfo for same classes), potentially also influencing 
> performance of outside code that is also using Introspector.
>
> First some results based on classgctest ( 
> https://github.com/jexler/classgc ).
>
> script-running-use-case, with the default "weak-gc-poc.cleanup" 
> setting of "class":
>
> $ java -XX:MaxMetaspaceSize=256m -Xmx512m \
>     -cp .:groovy-2.5.0-weak-gc-poc.jar ClassGCTester \
>     -cp filling/ -parent tester -classes GroovyFilling
>
> Secs         Test classes    Metaspace/PermGen                 Heap  Load time  Create time
>       #loaded  #remaining      used  committed      used  committed    average      average
>    0        1           1      6.4m       6.5m     14.1m     245.5m    1.226ms     11.831ms
>    1      482         482      9.1m      10.5m     25.9m     245.5m    0.343ms      1.650ms
>    2     1356        1356     12.5m      15.8m     63.1m     245.5m    0.265ms      1.167ms
>    3     2398         137      7.9m      16.8m     19.7m     224.0m    0.243ms      0.977ms
>    4     3475        1214     12.0m      16.8m     20.5m     239.5m    0.223ms      0.902ms
>
> So, weakly-collectable, what we want.
>
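Figures like the Metaspace/PermGen and Heap columns can be read from inside a running VM through the standard java.lang.management beans. The sketch below shows one way to do that; it is not taken from classgctest, MemorySnapshot is a hypothetical name, and the pool-name matching is an assumption that fits HotSpot naming.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.Locale;

// Illustration only: MemorySnapshot is hypothetical, not classgctest code.
final class MemorySnapshot {

    // Finds the class-metadata pool: "Metaspace" on Java 8+, a "... Perm Gen"
    // pool on older HotSpot VMs. Pool names are VM-specific.
    static MemoryUsage classMetadataUsage() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            if (name.equals("Metaspace") || name.contains("Perm Gen")) {
                return pool.getUsage();
            }
        }
        return null; // no matching pool on this VM
    }

    // Formats a byte count as megabytes with one decimal, e.g. "6.4m".
    static String megabytes(long bytes) {
        return String.format(Locale.ROOT, "%.1fm", bytes / (1024.0 * 1024.0));
    }

    public static void main(String[] args) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        MemoryUsage meta = classMetadataUsage();
        ClassLoadingMXBean classes = ManagementFactory.getClassLoadingMXBean();

        if (meta != null) {
            System.out.println("class metadata used/committed: "
                    + megabytes(meta.getUsed()) + " / " + megabytes(meta.getCommitted()));
        }
        System.out.println("heap used/committed:           "
                + megabytes(heap.getUsed()) + " / " + megabytes(heap.getCommitted()));
        System.out.println("classes loaded/unloaded:       "
                + classes.getLoadedClassCount() + " / " + classes.getUnloadedClassCount());
    }
}
```

A drop in "used" together with a rising unloaded-class count is the signature of classes actually being collected, as in the row for second 3 above.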
> gradle-use-case, first with "class":
>
> $ java -XX:MaxMetaspaceSize=256m -Xmx512m -cp . ClassGCTester \
>     -cp groovy-2.5.0-weak-gc-poc.jar:filling/ -parent null \
>     -classes GroovyFilling
>
> Secs         Test classes    Metaspace/PermGen                 Heap  Load time  Create time
>       #loaded  #remaining      used  committed      used  committed    average      average
>    0        1           1      8.1m       8.5m     17.9m     245.5m    2.249ms    131.702ms
>    1        9           9     22.9m      23.9m     23.7m     240.0m    1.728ms    115.582ms
>    2       18          18     39.0m      40.6m     47.8m     300.5m    1.450ms    112.826ms
>    3       26          26     53.3m      55.1m    108.7m     300.5m    1.456ms    113.726ms
>    4       36          36     71.1m      73.6m    103.8m     396.0m    1.372ms    110.934ms
>    5       46          46     88.9m      92.1m    180.5m     396.0m    1.335ms    107.233ms
>    6       56          56    106.7m     110.3m     99.0m     414.0m    1.308ms    107.037ms
>    7       66          66    124.5m     128.8m    109.3m     443.5m    1.267ms    104.878ms
>    8       77          77    144.1m     148.9m    111.6m     437.0m    1.229ms    103.268ms
>    9       86          86    160.1m     165.4m    102.0m     467.0m    1.206ms    103.848ms
>   10       96          96    177.9m     183.9m    115.8m     465.0m    1.188ms    102.931ms
>   11      107         107    197.5m     204.0m    128.3m     450.5m    1.166ms    102.170ms
>   12      117         117    215.3m     222.4m    132.6m     459.5m    1.149ms    101.614ms
>   13      127         127    233.1m     240.9m    142.7m     458.5m    1.142ms    101.311ms
>   14      136           3     10.9m      60.0m     17.5m     450.0m    1.135ms    103.695ms
>
> So, softly-collectable, which is because the Introspector keeps 
> BeanInfo for superclasses, as becomes evident when explicitly running 
> the test again with "super":
>
> $ java -XX:MaxMetaspaceSize=256m -Xmx512m -Dweak-gc-poc.cleanup=super \
>     -cp . ClassGCTester -cp groovy-2.5.0-weak-gc-poc.jar:filling/ \
>     -parent null -classes GroovyFilling
>
> Secs         Test classes    Metaspace/PermGen                 Heap  Load time  Create time
>       #loaded  #remaining      used  committed      used  committed    average      average
>    0        1           1      8.1m       8.5m     17.9m     245.5m    2.307ms    125.460ms
>    1        9           3     10.6m      16.8m     18.3m     233.0m    1.668ms    114.096ms
>    2       19          12     26.6m      27.9m     29.4m     295.5m    1.661ms    111.405ms
>    3       27          20     40.9m      42.3m     90.3m     295.5m    1.729ms    111.737ms
>    4       37          19     39.2m      41.3m     81.7m     358.5m    1.658ms    107.926ms
>    5       47          29     57.0m      59.0m    156.8m     358.5m    1.632ms    104.915ms
>    6       57           8     19.7m      29.5m     47.4m     344.0m    2.088ms    103.593ms
>    7       68          19     39.2m      43.5m     62.7m     372.5m    1.986ms    101.548ms
>
> So, weakly-collectable in this case.
>
> Let me first present some quick results regarding performance before 
> discussing where maybe to take this...
>
> First test script, script0.groovy:
> -- 
> def shell = new GroovyShell()
> for (int i=0; i<1000; i++) {
>    long t0 = System.nanoTime()
>    for (int j=0; j<1000; j++) {
>       shell.run("return $i+$j", "script", [])
>    }
>    long t1 = System.nanoTime()
>    printf("%3d: %3.1fs%n", i, ((double)(t1-t0))/1000000000)
> }
> -- 
>
> $ groovyc script0.groovy
>
> Then running it first with 2.5.0-SNAPSHOT (current master):
>
> $ java -cp .:/Users/alain/tech/unix/groovy-2.5.0-SNAPSHOT/lib/groovy-2.5.0-SNAPSHOT.jar script0
>   0: 3.6s
>   1: 2.7s
>   2: 2.1s
>   3: 2.4s
>   4: 1.7s
>   5: 1.8s
>   6: 1.6s
>   7: 2.7s
>   8: 1.6s
>   9: 2.0s
>
> And then with the PoC (default "class"):
>
> $ java -cp .:/Users/alain/tech/unix/groovy-2.5.0-weak-gc-poc/lib/groovy-2.5.0-weak-gc-poc.jar script0
>   0: 3.5s
>   1: 2.3s
>   2: 1.8s
>   3: 1.7s
>   4: 1.6s
>   5: 1.4s
>   6: 1.2s
>   7: 1.2s
>   8: 1.3s
>   9: 1.3s
>
> So, very similar performance; the PoC even appears slightly faster,
> and with the PoC classes were weakly-collectable, as expected.
>
> Second test script, script1.groovy (rather ugly, but works):
> -- 
>
> def scriptText = """
> class Script1 extends Script {
>    static class Inner {
>        int x = 1;
>    }
>
>    Object run() {
>        int x = new Inner().x
>        int y = new Parallel().y
>        return x+y
>    }
> }
>
> class Parallel {
>     int y = 2;
> }
> """
>
> def shell = new GroovyShell()
> for (int i=0; i<1000; i++) {
>    long t0 = System.nanoTime()
>    for (int j=0; j<1000; j++) {
>       shell.run(scriptText, "script", [])
>    }
>    long t1 = System.nanoTime()
>    printf("%3d: %3.1fs%n", i, ((double)(t1-t0))/1000000000)
> }
> -- 
>
> Output with 2.5.0 master:
>
> $ java -cp .:/Users/alain/tech/unix/groovy-2.5.0-SNAPSHOT/lib/groovy-2.5.0-SNAPSHOT.jar script1
>   0: 8.6s
>   1: 6.5s
>   2: 5.8s
>   3: 4.7s
>   4: 5.9s
>   5: 5.0s
>   6: 5.2s
>   7: 5.3s
>   8: 6.8s
>   9: 5.4s
>
> Output with PoC (default "class"):
>
> $ java -cp .:/Users/alain/tech/unix/groovy-2.5.0-weak-gc-poc/lib/groovy-2.5.0-weak-gc-poc.jar script1
>   0: 8.2s
>   1: 5.9s
>   2: 4.5s
>   3: 4.0s
>   4: 4.0s
>   5: 3.7s
>   6: 3.8s
>   7: 3.6s
>   8: 3.6s
>   9: 3.5s
>
> This time the PoC appears to be really faster and YES, classes were 
> also weakly-collectable in this case, even though several Groovy 
> classes were generated by compiling the script text Script1.
>
> The third test script introduces a little bit of concurrency, although
> very likely not enough to really stress things, script2.groovy:
> -- 
> def scriptText = """
> class Script1 extends Script {
>    static class Inner {
>        int x = 1;
>    }
>
>    Object run() {
>        int x = new Inner().x
>        int y = new Parallel().y
>        return x+y
>    }
> }
>
> class Parallel {
>     int y = 2;
> }
> """
>
> def shells = new GroovyShell[10]
> for (int t=0; t<10; t++) {
>   shells[t] = new GroovyShell()
> }
> for (int i=0; i<1000; i++) {
>    long t0 = System.nanoTime()
>    def threads = new Thread[10]
>    for (int t=0; t<10; t++) {
>       final int n = t
>       threads[n] = Thread.start {
>          for (int j=0; j<100; j++) {
>             shells[n].run(scriptText, "script", [])
>          }
>       }
>    }
>    for (int t=0; t<10; t++) {
>       threads[t].join()
>    }
>    long t1 = System.nanoTime()
>    printf("%3d: %3.1fs%n", i, ((double)(t1-t0))/1000000000)
> }
> -- 
>
> Output with 2.5.0 master:
>
> $ java -cp .:/Users/alain/tech/unix/groovy-2.5.0-SNAPSHOT/lib/groovy-2.5.0-SNAPSHOT.jar script2
>   0: 3.4s
>   1: 2.7s
>   2: 3.0s
>   3: 2.1s
>   4: 3.4s
>   5: 2.5s
>   6: 2.7s
>   7: 2.8s
>   8: 4.4s
>   9: 2.3s
>
> Output with PoC (default "class"):
>
> $ java -cp .:/Users/alain/tech/unix/groovy-2.5.0-weak-gc-poc/lib/groovy-2.5.0-weak-gc-poc.jar script2
>   0: 2.8s
>   1: 2.0s
>   2: 1.7s
>   3: 1.7s
>   4: 1.6s
>   5: 1.6s
>   6: 1.8s
>   7: 1.3s
>   8: 1.7s
>   9: 1.7s
>
> Once more the PoC appears to be faster and again classes were 
> weakly-collectable.
>
> I have also run the Gradle build of grengine with 2.5.0 master and the
> PoC. The build contains 6 unit tests that load lots of Groovy classes
> in separate threads, but compile little. There the PoC appeared to be
> slightly slower (0-5%) than 2.5.0 master.
>
> What next?
>
> Maybe you first want to take a look yourself?
>
> A distribution based on the PoC is available at 
> https://www.jexler.net/apache-groovy-binary-2.5.0-weak-gc-poc.zip
>
> If there was a consensus to continue to evaluate this approach:
>
> Regarding the map taken from Spring Framework:
> - Is it really thread-safe? (Naively one would assume so because it is
> part of a widely-used framework, but assumption is the mother of all ...)
> - Does it ever perform severely worse than the current implementation?
> - Do classes also remain weakly-collectable if more complex things are
> done (and stored in ClassInfo)?
> - OK to use the code? I presume yes, since it is also licensed under
> Apache 2.0; maybe it is necessary to mention it in some other places
> outside of the code?
>
> Regarding Introspector:
> - Go with "weak-gc-poc.cleanup" always "class", i.e. remove the system 
> property completely?
> - For Gradle, recommend to call Introspector.flushCaches() after each 
> build to make classes immediately available for garbage collection?
> - Similar recommendation for similar use cases like e.g. Groovy in a 
> webapp container?
> - Later migrate away from Introspector entirely, to solve this for
> good?
>
> But is this the way to go, and for which version?
>
> You have to help me out here:
> - Could this be a candidate for a 2.4.7, even though ClassInfo would 
> get a new public method (getBeanInfo)?
> - Does this sound interesting enough to do more at the moment?
> - If yes, who could maybe test a few more things, maybe with Gradle or 
> Grails etc.? (Or does that usually only happen in a beta?)
>
> Please tell me if there is anything more I can do here to help out...
>
> Alain
>
> .
>

