groovy-dev mailing list archives

From Cédric Champeau <cchamp...@apache.org>
Subject Towards a better compiler
Date Sun, 09 Apr 2017 09:53:18 GMT
Hi team!

I would like to set up some additional goals for the next release of Groovy.
As you may know, Gradle 3.5, to be released tomorrow, will ship with a build
cache [1]. But Groovy is causing us some trouble, because the output of
compilation is not reproducible. In other words, for the same inputs, we
can randomly get a different output. To be clear, while the output of
Groovy is semantically correct, we cannot guarantee that for the same
sources, the same bytecode, byte to byte, is going to be generated. This is
a big problem for Gradle, because it affects the cache key, and a cache
miss has terrible consequences: since we need to rebuild everything when a
build script classpath changes, a change in the output of a build script
bytecode means we need to invalidate a full build...

As an illustration, I fixed 2 issues this week [2] and [3]. But it's not
enough. We should revise our usage of maps and sets, and use their linked
counterparts when it makes sense. That alone, however, cannot guarantee that
we have reproducible builds, in particular across platforms. There are
hundreds of places where we use hash sets/maps with objects that do not
implement equals/hashCode, for example (and since those structures are
mutable, we're very lucky that the default hashCode uses the system one,
which is immutable).
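
To make the ordering problem concrete, here is a minimal Java sketch. The
`Node` class and the element names are hypothetical stand-ins, not actual
compiler classes. It fills a `HashSet` and a `LinkedHashSet` with objects
that do not override equals/hashCode and prints the iteration order of each.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class IterationOrderDemo {
    // Hypothetical stand-in for an object that relies on the default
    // (identity-based) equals/hashCode.
    static final class Node {
        final String name;
        Node(String name) { this.name = name; }
        @Override public String toString() { return name; }
    }

    // Adds five fresh nodes, always in the same insertion order.
    static Set<Node> fill(Set<Node> set) {
        for (String name : new String[] {"a", "b", "c", "d", "e"}) {
            set.add(new Node(name));
        }
        return set;
    }

    // Collects the node names in the order the set iterates over them.
    static List<String> namesInIterationOrder(Set<Node> set) {
        List<String> names = new ArrayList<>();
        for (Node n : set) {
            names.add(n.name);
        }
        return names;
    }

    public static void main(String[] args) {
        // Order depends on identity hash codes, so it is not guaranteed
        // to be the same from one JVM run to the next.
        System.out.println("HashSet:       "
                + namesInIterationOrder(fill(new HashSet<>())));
        // Order is always the insertion order: [a, b, c, d, e]
        System.out.println("LinkedHashSet: "
                + namesInIterationOrder(fill(new LinkedHashSet<>())));
    }
}
```

If anything written to the class file is derived from iterating such a hash
set, the bytes can differ between runs even though the program means the
same thing. The linked variant removes that particular source of variation,
but only if the insertion order itself is deterministic.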

Finally, if you have read my blog post about performance of Gradle 3.4
[4], you would understand that the Java compiler does a much better job
than we do at separating things required by the compiler from things
required in user space. In particular, we at Groovy offer AST
transformations, which is similar, but not equivalent, to what annotation
processors are for javac. The problem is that Groovy only has a single
compile classpath, which includes both the "compiler plugins" (AST
transformations) and implementation dependencies of the project we compile.
So we're effectively mixing things that shouldn't be mixed:

- annotations for AST transformations should be on the compile classpath
- implementation of the AST transformations should be on the AST
transformations path (compiler classpath)

If we don't do this, we cannot be as smart as we are in the Java world and
compute what is relevant in terms of ABI (application binary interface). So
any change to the classpath needs to be considered a breaking change, and we
need to recompile everything. This makes the implementation of a Groovy
incremental compiler effectively impossible. Furthermore, it prevents AST
transform implementors from using the libraries they want as dependencies
without leaking them to the user compile classpath (imagine an AST xform
which uses Guava 18, while the user wants Guava 17 on their classpath). In
practice, the implementation dependencies of the AST xform are only required
at compile time, not at runtime, so they should be separate.

In short, I vote for the adoption of a new `-compilerpath` where we would
put everything that affects the compiler itself, and leave `--classpath` for
the user space. Doing this would let tools like Gradle be much smarter
about what they can do.
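
For comparison, javac already keeps these two paths apart for annotation
processors through its `-classpath` and `-processorpath` options. The sketch
below only assembles such a javac argument list; it does not show the
proposed Groovy option, and every jar and source file name in it is a
hypothetical placeholder.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class JavacPathSplit {
    // Builds javac arguments that keep user-space dependencies and
    // compiler-plugin (annotation processor) dependencies apart.
    static List<String> javacArgs(List<String> compileClasspath,
                                  List<String> processorPath,
                                  List<String> sources) {
        List<String> args = new ArrayList<>();
        // What the compiled code itself may reference.
        args.add("-classpath");
        args.add(String.join(File.pathSeparator, compileClasspath));
        // Where javac looks for processors and their own dependencies.
        args.add("-processorpath");
        args.add(String.join(File.pathSeparator, processorPath));
        args.addAll(sources);
        return args;
    }

    public static void main(String[] args) {
        // Hypothetical jars: the user wants Guava 17, while the
        // processor implementation was built against Guava 18.
        List<String> cmd = javacArgs(
                Arrays.asList("libs/my-annotations.jar", "libs/guava-17.0.jar"),
                Arrays.asList("libs/my-processor.jar", "libs/guava-18.0.jar"),
                Arrays.asList("src/Foo.java"));
        System.out.println("javac " + String.join(" ", cmd));
    }
}
```

With this split, a build tool can treat a change on the processor path
differently from a change on the compile classpath, which is the kind of
distinction the proposal asks for on the Groovy side.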

[1] https://docs.gradle.org/3.5-rc-3/release-notes.html
[2] https://issues.apache.org/jira/browse/GROOVY-8142
[3] https://issues.apache.org/jira/browse/GROOVY-8148
[4] https://blog.gradle.org/incremental-compiler-avoidance
