groovy-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Goo Mail <goomai...@gmail.com>
Subject Re: Groovy parser very slow when using ImportCustomizers
Date Thu, 17 Dec 2015 02:59:38 GMT
Jamie and Cédric,

Following up on this earlier conversation about how a shared
CompilerConfiguration object impacts parsing performance with
ImportCustomizers, I believe I figured out what is going on:

The original executive summary was that applying many ImportCustomizers to
a script was very slow (costing us hundreds of ms in parse time per
script). If I share a single CompilerConfiguration object across multiple
scripts, the cost of using the ImportCustomizers suddenly goes down to
almost zero on the second and subsequent parses. This is with Groovy 2.3.6,
for what it is worth.

I have since found that sharing the entire CompilerConfiguration is not
even necessary, since it is only the ImportCustomizer itself that "helped"
with performance. In other words, if you create multiple
CompilerConfiguration objects but attach the same ImportCustomizer to both,
then you'll see the higher performance demonstrated in my test case below.
But if you create an entirely new ImportCustomizer every time that has an
identical set of imports as the original ImportCustomizer object, then your
performance will take a big hit.

As far as I can tell, this speed differential occurs because the
ImportCustomizer is mutable, and when I run the Groovy parser with an
ImportCustomizer attached, the ImportCustomizer object is actually changed
(or, more precisely, the objects it references are changed). I think this
happens in ResolveVisitor as part of the class resolution process.

That also presumably explains why the second and subsequent parses are so
fast (classes are already resolved)...although I admit that the 300 ms hit
for parsing 35 classes in the very first invocation still seems slow.

I came to the above conclusion about mutability of the ImportCustomizer
after stumbling across a race condition: if I create a single
ImportCustomizer and attach it to scripts that are being parsed
simultaneously on different threads, I start getting random errors with my
script class's constructor being unable to resolve many of the class names
listed in the ImportCustomizer imports. The errors are somewhat mangled by
having "." randomly changed to "$"' in parts of the package name (even
though no inner classes are involved). This looks to be some side-effect of
the ResolveVisitor#resolve() process, since I see it replacing some strings
and calling setName() on the node.

Whatever manipulation is going on there was apparently not intended to
performed on shared objects, but I was able to make it work by serializing
all calls to the Groovy parser until at least one good parse has been
completed. After the first good parse, the types in the ImportCustomizer
seem to be fully resolved and I can again parse scripts on multiple thread
simultaneously with the shared ImportCustomizer.

With all of that having been written: is sharing the same ImportCustomizer
across different classloaders and parse sessions still a good idea? I like
the performance improvement that sharing the ImportCustomizer provides, but
the mutability aspect seems like more of an inadvertent feature and/or
issue that might not have been intended by the Groovy designers.

Thanks for any insight any of you can provide!

Scott


On Fri, Nov 20, 2015 at 4:07 PM, Goo Mail <goomail59@gmail.com> wrote:

> Hi Jamie,
>
> If it's caching something, it seems unrelated to the Groovy script itself.
>
> For example, I tried adjusting the code to append additional statements to
> the Groovy script on every iteration. When doing this, the parse times
> increase proportionately to the script length. However, even in this
> scenario, the "slow" parse times were still uniformly ~300 ms slower than
> the "fast" parse, depending on whether the CompilerConfiguration is a new
> object or one already seen before.
>
> Perhaps it's related to excess loading/resolving of those imported
> classes? (Any caching of the imported classes is desired behavior on my
> part anyway!) But the CompilerConfiguration influence is still mysterious
> to me.
>
> Scott
>
>
> On Fri, Nov 20, 2015 at 3:50 PM, Jamie Echlin <jamie.echlin@gmail.com>
> wrote:
>
>> I am quite suspicious of your test code, because the first of the fast
>> runs is actually as slow as the slow runs. Looks like there is some caching
>> in play, but I can't see where from the code.
>>
>> On Fri, Nov 20, 2015 at 8:14 PM, Goo Mail <goomail59@gmail.com> wrote:
>>
>>> Cédric,
>>>
>>> Thanks for your response. In this case, no, I am adding fully-qualified
>>> single class names (see the pastebin <http://pastebin.com/V1ijXPjC>
>>>  example).
>>>
>>> Scott
>>>
>>> On Fri, Nov 20, 2015 at 3:11 PM, Cédric Champeau <
>>> cedric.champeau@gmail.com> wrote:
>>>
>>>> Are the imports you are adding star imports? If so, it is likely the
>>>> problem. If you know the exact classes that you want to import, it is
>>>> better to import them as fully qualified class names. I had a similar issue
>>>> with Gradle recently, and it has been partly mitigated with this commit:
>>>> https://github.com/gradle/gradle/commit/42e9f41e41880178aa85ebac6956690c711c7130
>>>>
>>>> I'm pretty sure we can improve resolution time by providing better
>>>> knowledge to the compiler of what imports are possible.
>>>>
>>>> 2015-11-20 21:05 GMT+01:00 Goo Mail <goomail59@gmail.com>:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I am using Groovy embedded into an application which runs
>>>>> user-provided scripts. I see a significant parsing performance issue
when
>>>>> the compiler is configured to add ~30 automatic imports via an
>>>>> ImportCustomizer, but I can't figure out why it is happening.
>>>>>
>>>>> For various unrelated reasons, we segregate individual user scripts by
>>>>> creating a new GroovyClassLoader for each one and parsing each script
in
>>>>> its own context. (These classloaders all chain to a single parent Groovy
>>>>> classloader that contains some shared functionality.)
>>>>>
>>>>> With the scripts segregated like this, I noticed that even an empty
>>>>> script takes 300+ ms to parse, which seems excessive. If I remove the
>>>>> automatic imports, the parse takes <10 ms instead. But I also found
>>>>> something bizarre in how the CompilerConfiguration seems to influence
the
>>>>> parse time.
>>>>>
>>>>> I am 99% sure that the parsed class is not cached, since I turned off
>>>>> all of those options and I also stepped into the parse() function to
>>>>> confirm that it was walking through all of the phases.
>>>>>
>>>>> Could it be that the extra time is actually consumed by
>>>>> loading/resolving the imported classes, and not something Groovy is
>>>>> manipulating during the parse itself? That seems possible (and is
>>>>> potentially understandable, at least on the first invocation), except
for
>>>>> one weirdness:
>>>>>
>>>>> The parse time problem appears tied to the specific instance of a
>>>>> CompilerConfiguration object being provided to the child classloader.
For
>>>>> example, when creating the child classloader, if I supply the same
>>>>> CompilerConfiguration object that was provided to the parent classloader,
>>>>> the parse time suddenly get very fast!
>>>>>
>>>>> But if I instead construct a new CompilerConfiguration object for
>>>>> every script, the parse is very slow, even though the newly-constructed
CC
>>>>> is identical to the previous one, and even though each script is still
>>>>> using its own separate classloader.
>>>>>
>>>>> What is going on (and how do I fix it)? Here is an example that you
>>>>> can run in groovysh that demonstrates the problem:
>>>>>
>>>>> http://pastebin.com/V1ijXPjC
>>>>>
>>>>> Here is the output:
>>>>>
>>>>> ===> true
>>>>> Showing slow parse time
>>>>> Parse time: 281 ms
>>>>> Parse time: 260 ms
>>>>> Parse time: 280 ms
>>>>> Parse time: 277 ms
>>>>> Parse time: 259 ms
>>>>> Parse time: 257 ms
>>>>> Parse time: 279 ms
>>>>> Parse time: 260 ms
>>>>> Parse time: 278 ms
>>>>> Parse time: 282 ms
>>>>> ===> null
>>>>> Showing fast parse time
>>>>> Parse time: 280 ms
>>>>> Parse time: 3 ms
>>>>> Parse time: 3 ms
>>>>> Parse time: 2 ms
>>>>> Parse time: 3 ms
>>>>> Parse time: 2 ms
>>>>> Parse time: 2 ms
>>>>> Parse time: 3 ms
>>>>> Parse time: 3 ms
>>>>> Parse time: 2 ms
>>>>> ===> null
>>>>>
>>>>> Thanks for any insight you can provide!
>>>>>
>>>>> - Scott
>>>>>
>>>>>
>>>>
>>>
>>
>

Mime
View raw message