groovy-dev mailing list archives

From Alain Stalder <astal...@span.ch>
Subject Re: Improve Groovy class loading performance and memory management
Date Mon, 16 May 2016 20:34:16 GMT
Thanks, I had not looked at the master branch, ClassInfo source looks 
quite a bit cleaner there already :)

Regarding programmatic cleanup (GROOVY-7646), I think that is a good 
idea, but in the details there might be some obstacles.

For example this sequence of calls to GroovyShell:

def shell = new GroovyShell()
def script1 = shell.parse("42")
assert script1.class.name == "Script1"
def script2 = shell.parse("new Script1().run()")
assert script2.run() == 42

def script3 = shell.parse("99", "Ninetynine")
assert script3.class.name == "Ninetynine"

def file = new File("Fiftyfive.groovy")
file.setText("55")
def script4 = shell.parse(file)
assert script4.class.name == "Fiftyfive"

So, classes accumulate (in the GroovyClassLoader) and can be addressed 
by their names in subsequent scripts. (And for more complex script 
expressions, more than one class might be the result of compilation, 
e.g. with closures or inner classes, enums, etc.)
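
To make the accumulation visible, here is a minimal sketch. It assumes only the public GroovyShell.getClassLoader() and GroovyClassLoader.getLoadedClasses() calls; the closure script is an illustrative input of mine, not something from the thread:

```groovy
def shell = new GroovyShell()

// One plain script compiles to one class with an auto-generated name.
shell.parse("42")

// A script containing a closure compiles to more than one class.
shell.parse("def twice = { it * 2 }; twice(21)")

// Everything compiled so far is still cached in the shell's GroovyClassLoader.
def names = shell.classLoader.loadedClasses*.name.sort()
names.each { println it }
```

The list should contain the two script classes plus at least one generated class for the closure; the exact name of the closure class is a compiler detail.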

I would estimate that in the case where a script is run with a name 
given automatically by the GroovyShell ("Script1", "Script2", ...) it 
would be OK to do the cleanup (and I guess using GroovyShell that way 
might be a very common case?), but when it comes to explicitly named 
scripts, doing so might change behaviour of existing code.
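
As a rough illustration of what such a cleanup could look like for the auto-named case, here is a sketch that uses only calls that already exist (MetaClassRegistry.removeMetaClass() and GroovyClassLoader.clearCache()). It is not the API discussed in GROOVY-7646, just my assumption of the minimal manual steps:

```groovy
def shell = new GroovyShell()
def script = shell.parse("42")   // auto-named; nothing else refers to it by name
assert script.run() == 42

// Drop per-class meta information, then empty the loader's class cache.
def loader = shell.classLoader
loader.loadedClasses.each { GroovySystem.metaClassRegistry.removeMetaClass(it) }
loader.clearCache()
script = null                    // release our own reference as well

def remaining = loader.loadedClasses.length
println "classes still cached: ${remaining}"
```

Whether the classes then actually get unloaded still depends on no other strong references remaining. With explicitly named scripts, a later script that refers to such a name would presumably no longer resolve it after this kind of cleanup, which is exactly the behaviour change mentioned above.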

I just took a look at GroovyScriptEngine, which also has run() methods 
and, if I remember correctly, recompiles all scripts if one of them 
changes (to get dependencies right), so in principle there are lots of 
classes to clean up each time this happens. But I am not sure if that is 
possible there, because there is also a createScript() method, so 
objects/classes from earlier compilations might still be in use.

(And I have also just started to think about Grengine in that context, 
my open source library for using Groovy in a Java VM (and which almost 
nobody uses ;), there it might be easier to build in such automatic 
removal because the approach is more structured, although a bit less 
dynamic.)

Hmm, would really be great if there was a way to achieve constant 
garbage collection of Groovy classes.

Alain

On 16.05.16 18:18, John Wagenleitner wrote:
> Just catching up on this thread, very interesting discussion and will 
> have to give the posted test code a try.
>
> You are right about the PhantomReference and it has been removed in 
> master [1] along with the local cache that used it.  Due to some 
> refactorings that were not in 2_4_X at the time it wasn't removed from 
> that branch. But probably should be cleaned up if any fixes for the 
> memory issues to ClassInfo are merged into that branch in the near future.
>
> I think the suggestion of referencing the Class via the ClassInfo from 
> metaclasses/cachedclasses would be a good one, the fewer places the 
> Class is kept the better.  Unfortunately, since it's a protected field 
> on MetaClassImpl, that would be a breaking change and so something 
> for a 3.0, as you pointed out.
>
> So far, I have found that keeping a WeakReference to the Class in 
> ClassInfo allows it to be collected (mostly tested with non-ClassValue 
> version of ClassInfo).  At least one exception is if methods are added 
> to the metaclass then it's required to setMetaClass(null) since the 
> ExpandoMetaClass is a strong reference on ClassInfo and EMC has a 
> strong reference to the Class.  What is difficult to determine is if 
> keeping a WeakReference can cause any potential issues.  Only possible 
> problem I can see is if the methods of the Class A were referenced in 
> the MetaMethodIndex for Class B, but I think in that case as long as 
> the Class B was strongly referenced the index itself would keep Class 
> A referenced.
>
> In environments where lots of scripts are being parsed and run and 
> references to the Class are not retained, it might be worth having a 
> way to programmatically initiate the cleanup so as not to have to wait 
> for the Soft References to be collected.  The extra performance cost 
> of clearing a few references might well be lower than that of 
> constantly hitting the upper heap limit.  It is something I have 
> looked at for GROOVY-7646 [2].  Parsed groovy classes should be 
> collectible by default without any intervention, but there may be 
> cases where an API to help speed the removal might be useful too.
>
>
> [1] 
> https://github.com/apache/groovy/commit/e967039222dc01a59824f95d9313a3b2e7aa9f50 
>
> [2] https://github.com/apache/groovy/pull/325
>
>
> On Mon, May 16, 2016 at 8:01 AM, Alain Stalder <astalder@span.ch> wrote:
>
>     My time here is running out, other things to attend to, so here is
>     what I wrote about the current state of class loading and garbage
>     collection in Groovy in the just updated user manual of Grengine:
>
>     https://www.grengine.ch/manual.html#the-cost-of-session-separation
>     --
>     ==== The Cost of Session Separation
>
>     Although loading classes from bytecode obtained from compiling Groovy
>     scripts is a lot less expensive than compiling them (plus afterwards
>     also loading the resulting bytecode), it is still somewhat more
>     expensive than one might naively expect and there are a few things to
>     be aware of when operating that way.
>
>     In the following, I will simply call classes compiled by the Groovy
>     compiler from Groovy scripts/sources _Groovy classes_ and classes
>     compiled by the Java compiler from Java sources _Java classes_.
>
>     * *Class Loading* +
>       Experimentally, loading of a typical Groovy class is often about 10
>       times slower than loading a Java class with similarly complex source
>       code, but both are relatively expensive operations (of the order of
>       a millisecond for a small Groovy class, to give a rough indication).
>       For Java classes, this is apparently mainly expensive because some
>       security checks have to be made on the bytecode. For Groovy classes,
>       it is mainly expensive because some meta information is needed to
>       later efficiently call methods dynamically, and the like.
>     * *Garbage Collection* +
>       Classes are stored in _PermGen_ (up to Java 7) or _Metaspace_
>       (Java 8 and later), plus some associated data on the Heap; at least
>       for Groovy classes the latter is normally the case (meta
>       information). Whereas unused Java classes appear to be usually
>       garbage collected from PermGen/Metaspace continuously, with Groovy
>       classes this experimentally does not happen before PermGen/Metaspace
>       or the Heap reach a configured limit. Why exactly this is so, whether
>       it is easy to change and whether it will change in the future is
>       difficult for me to answer; I find the code around it rather
>       convoluted and hard to untangle. Note that by default on Java VMs
>       there is typically no limit set for Metaspace (but there is for
>       PermGen), so setting a limit is crucial in practice when using
>       Groovy.
>     * *Garbage Collection Bugs* +
>       In the past, several Groovy versions failed at garbage collecting
>       Groovy classes and their class loaders, resulting finally in an
>       `OutOfMemoryError` due to exhaustion of PermGen/Metaspace or the
>       Heap, whichever limit was reached first. If, when you are reading
>       this, Groovy 2.4.6 is (still) the newest version, make sure you set
>       the system property `groovy.use.classvalue=true` in the context of
>       Grengine. Note that under different circumstances, like the one
>       described in
>       https://issues.apache.org/jira/browse/GROOVY-7591[GROOVY-7591: Use of ClassValue causes major memory leak]
>       you would instead have had to set it to false! That Groovy bug is
>       actually in turn due to a bug in Oracle/OpenJDK Java VMs regarding
>       garbage collection under some circumstances, more precisely a bug in
>       a new feature (`ClassValue`) introduced in order to make things
>       easier(!) for dynamic languages in the Java VM, see
>       https://bugs.openjdk.java.net/browse/JDK-8136353[JDK-8136353].
>
>     So, if you want to use session separation with Grengine (or otherwise
>     want to load many Groovy classes repeatedly), first set a limit on
>     PermGen/Metaspace, then verify that classes can be garbage collected
>     in an environment close to production and that throughput under load
>     would be sufficient (despite the relatively slow class loading
>     performance of Groovy (and Java) classes in the Java VM) and then use
>     it. And don't forget to repeat this at least when you upgrade Groovy
>     to a new version, but possibly also when you upgrade Java.
>
>     Or see the next section for an alternative...
>     --
>
>     PS: By the way, very funny how Jochen Theodorou "garbage collected"
>     what I wrote about PhantomReference to a "[...]"...
>
>     Good luck with Groovy garbage collection.
>
>     Alain
>
>

