groovy-dev mailing list archives

From Cédric Champeau <cchamp...@apache.org>
Subject Re: Towards a better compiler
Date Fri, 21 Apr 2017 17:46:18 GMT
2017-04-21 14:33 GMT+02:00 Andres Almiray <aalmiray@gmail.com>:

> I had a brief chat with Jochen a few days ago regarding this topic.
>
> Until now the usage of a Set or some other data structure was not really
> that important.
> If Groovy switches to a SortedSet then results may be more predictable,
> but is it to the benefit of the majority?
> What about opening the door for certain strategies to be injected from the
> outside, thus Gradle may inject certain customizations during the compiler
> process that make sense to the build tool but not for a Groovy developer
> compiling with the default Groovy compiler settings. This would require
> looking for places where custom strategies may be required (such as the
> particular collection to keep track of names discussed earlier), perhaps
> relying on ServiceLoader or some other mechanism to discover custom
> strategies or pick the default ones during compiler bootstrap.
>

I don't see the point, honestly. The fact is that the compiler, for a
specific set of sources + compile classpath, should always produce the same
output, especially on the same machine. If it doesn't, then the compiler is
not predictable, and that is a bug that needs to be fixed. It has nothing to
do with Gradle, and *all* users would benefit from this. Gradle is an
important use case because it currently defeats our caching, but it's not
the only one. IDE indexing is another example. It's a serious issue, and we
need to consider this a serious bug.
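
One way to check the property described above (same inputs, byte-identical
output) is to run the compile step several times and compare digests of the
results. The following is only a minimal sketch: `ReproducibilityCheck`,
`sha256Hex` and `isReproducible` are hypothetical names, and the
`Supplier<byte[]>` is an assumed stand-in for "compile these sources with this
classpath", not an actual Groovy compiler API.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class ReproducibilityCheck {

    // Hex-encoded SHA-256 digest of the given bytes.
    static String sha256Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data)) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is mandatory on every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    // Runs the (hypothetical) compile step `runs` times and reports whether
    // every run produced byte-identical output.
    static boolean isReproducible(Supplier<byte[]> compile, int runs) {
        String first = sha256Hex(compile.get());
        for (int i = 1; i < runs; i++) {
            if (!first.equals(sha256Hex(compile.get()))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Assumed stand-ins for a compiler invocation.
        Supplier<byte[]> stable = () -> "same bytes".getBytes(StandardCharsets.UTF_8);
        AtomicInteger counter = new AtomicInteger();
        Supplier<byte[]> unstable =
                () -> ("run " + counter.incrementAndGet()).getBytes(StandardCharsets.UTF_8);

        System.out.println("stable reproducible?   " + isReproducible(stable, 5));
        System.out.println("unstable reproducible? " + isReproducible(unstable, 5));
    }
}
```

A passing check on one machine does not prove reproducibility, since
hash-based ordering can vary between JVM runs and platforms, so in practice
such a check would probably need to be repeated across separate processes.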

BTW, we don't need to _sort_. We need to make sure that for the same input,
we have the same output. In particular, the order of interfaces in a type
declaration matters (and it is reproducible today). We *must not* reorder
them, or it would change semantics (typically for traits). I have fixed the
bugs we, in the Gradle team, have identified, but the point of my email was
that we should take better care of this, because we do a pretty bad job at
checking that the behavior of the compiler is deterministic.
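
The difference between sorting and merely preserving a stable order can be
sketched as follows. This is an illustration only, not code from the Groovy
compiler: `Node` is a hypothetical stand-in for a compiler object that does not
override equals/hashCode, and the interface names are made up. With a
`HashSet`, iteration order depends on identity hash codes and may differ
between runs; with a `LinkedHashSet`, iteration follows insertion order, so
declaration order is kept without any sorting.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class DeterministicOrderSketch {

    // Hypothetical compiler node; deliberately has no equals/hashCode, so
    // hash-based collections fall back to the identity hash code.
    static final class Node {
        final String name;

        Node(String name) {
            this.name = name;
        }
    }

    // Adds one node per name to the target set, in declaration order.
    static Set<Node> declare(Set<Node> target, String... names) {
        for (String name : names) {
            target.add(new Node(name));
        }
        return target;
    }

    // Returns the node names in whatever order the set iterates.
    static List<String> namesInIterationOrder(Set<Node> nodes) {
        List<String> names = new ArrayList<>();
        for (Node node : nodes) {
            names.add(node.name);
        }
        return names;
    }

    public static void main(String[] args) {
        String[] declared = {"Walker", "Flyer", "Swimmer"};

        // Order may change from one JVM run to the next.
        Set<Node> hashed = declare(new HashSet<>(), declared);
        System.out.println("HashSet:       " + namesInIterationOrder(hashed));

        // Always the declaration order, and never alphabetically sorted.
        Set<Node> linked = declare(new LinkedHashSet<>(), declared);
        System.out.println("LinkedHashSet: " + namesInIterationOrder(linked));
    }
}
```

Note that the linked variant keeps "Walker, Flyer, Swimmer" as declared,
whereas a sorted set would reorder the names, which is exactly what must not
happen for interface lists.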

Regarding the AST transformations classpath, I had actually forgotten that
Gradle integrates directly with the compiler, so it can use the two different
classpaths. But AFAIK, our CLI doesn't. This should be easy to fix.

>
> Regarding the 2nd query on referenced classes by AST. Yes, you would be
> forced to define the referenced classes in both classpaths.
>
> Cheers,
> Andres
>
> -------------------------------------------
> Java Champion; Groovy Enthusiast
> http://andresalmiray.com
> http://www.linkedin.com/in/aalmiray
> --
> What goes up, must come down. Ask any system administrator.
> There are 10 types of people in the world: Those who understand binary,
> and those who don't.
> To understand recursion, we must first understand recursion.
>
> On Fri, Apr 21, 2017 at 2:24 PM, Graeme Rocher <graeme.rocher@gmail.com>
> wrote:
>
>> Big +1 for making the compiler reproducible.
>>
>> I think the usage of Sets and HashMap has an impact on Groovydoc too
>> because Groovydoc generates different output each time it is run.
>>
>> For example the "extends" and "implements" output in Groovydoc changes
>> the order of the classes each time it is run. This must be down to
>> internal use of unordered sets or hash maps.
>>
>> Regarding classpath (compiler vs compile classpath), in many cases AST
>> transforms reference classes from libraries that are on the "compile"
>> classpath. How would you deal with this case? Have them in both
>> places?
>>
>> Cheers
>>
>> On Sun, Apr 9, 2017 at 11:53 AM, Cédric Champeau <cchampeau@apache.org>
>> wrote:
>> > Hi team!
>> >
>> > I would like to set up some additional goals for the next release of
>> > Groovy. As you may know, Gradle 3.5, released tomorrow, will ship with a
>> > build cache [1]. But Groovy is causing us some trouble, because the output
>> > of compilation is not reproducible. In other words, for the same inputs,
>> > we can randomly get a different output. To be clear, while the output of
>> > Groovy is semantically correct, we cannot guarantee that for the same
>> > sources, the same bytecode, byte for byte, is going to be generated. This
>> > is a big problem for Gradle, because it affects the cache key, and a cache
>> > miss has terrible consequences: since we need to rebuild everything when a
>> > build script classpath changes, a change in the bytecode of a build script
>> > means we need to invalidate a full build...
>> >
>> > As an illustration, I fixed 2 issues this week [2] and [3]. But it's not
>> > enough. We should revise our usage of maps and sets, and use their linked
>> > counterparts when it makes sense. That alone cannot guarantee that we have
>> > reproducible builds, in particular across platforms. There are hundreds of
>> > places where we use hash sets/maps with objects that do not implement
>> > equals/hashCode, for example (and since those structures are mutable,
>> > we're very lucky because the default hashCode uses the system one, which
>> > is immutable).
>> >
>> > Finally, if you have read my blog post about the performance of Gradle 3.4
>> > [4], you will understand that the Java compiler does a much better job
>> > than we do at separating things required by the compiler from things
>> > required in user space. In particular, we at Groovy offer AST
>> > transformations, which are similar, but not equivalent, to what
>> > annotation processors are for javac. The problem is that Groovy only has
>> > a single compile classpath, which includes both the "compiler plugins"
>> > (AST transformations) and implementation dependencies of the project we
>> > compile. So we're effectively mixing things that shouldn't be mixed:
>> >
>> > - annotations for AST transformations should be on the compile classpath
>> > - implementation of the AST transformations should be on the AST
>> > transformations path (compiler classpath)
>> >
>> > If we don't do this, we cannot be as smart as what we do in the Java
>> > world, and compute what is relevant in terms of ABI (application binary
>> > interface). So any change to the classpath needs to be considered a
>> > breaking change and we need to recompile everything. This makes the
>> > implementation of a Groovy incremental compiler effectively impossible.
>> > Furthermore, it prevents AST transform implementors from using the
>> > libraries they want as dependencies without leaking them to the user
>> > compile classpath (imagine an AST xform which uses Guava 18, while the
>> > user wants Guava 17 on his classpath). In practice, the implementation
>> > dependencies of the AST xform are only required at compile time, not
>> > runtime, so they should be separate.
>> >
>> > In short, I vote for the adoption of a new `-compilerpath` where we would
>> > put everything that affects the compiler itself, and leave `--classpath`
>> > for the user space. Doing this would let tools like Gradle be much
>> > smarter about what they can do.
>> >
>> > [1] https://docs.gradle.org/3.5-rc-3/release-notes.html
>> > [2] https://issues.apache.org/jira/browse/GROOVY-8142
>> > [3] https://issues.apache.org/jira/browse/GROOVY-8148
>> > [4] https://blog.gradle.org/incremental-compiler-avoidance
>>
>>
>>
>> --
>> Graeme Rocher
>>
>
>
