Hi list

On 21 Apr 2017, at 14.33, Andres Almiray <aalmiray@gmail.com> wrote:

I had a brief chat with Jochen a few days ago regarding this topic.

Until now the usage of a Set or some other data structure was not really that important.
If Groovy switches to a SortedSet then results may be more predictable, but, is it in the benefit of the majority?

For a reproducible build, LinkedHashSet would make more sense, as it preserves insertion order rather than sorting and doesn't rely on comparability. For subtle dependencies like the order of parent classes, this would be preferable.

Fixing the ordering can't possibly be worse than having it be random/unstable.
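
To make the difference concrete, here is a minimal Java sketch (not taken from the Groovy code base). The class name OrderDemo and the list of type names are made up for illustration; they stand in for something like the declared order of parent types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.TreeSet;

public class OrderDemo {
    // Hypothetical stand-in for parent types in the order they were declared.
    static final List<String> DECLARED =
            Arrays.asList("Serializable", "Comparable", "Cloneable");

    // LinkedHashSet iterates in insertion order, so declaration order survives.
    static List<String> linkedOrder() {
        return new ArrayList<>(new LinkedHashSet<>(DECLARED));
    }

    // TreeSet iterates in natural (here: alphabetical) order and requires
    // comparable elements or an explicit Comparator.
    static List<String> sortedOrder() {
        return new ArrayList<>(new TreeSet<>(DECLARED));
    }

    public static void main(String[] args) {
        System.out.println(linkedOrder()); // [Serializable, Comparable, Cloneable]
        System.out.println(sortedOrder()); // [Cloneable, Comparable, Serializable]
    }
}
```

Both variants are stable from run to run; only the LinkedHashSet keeps the order in which the names were added.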

What about opening the door for certain strategies to be injected from the outside? Gradle could then inject customizations into the compilation process that make sense to the build tool but not to a Groovy developer compiling with the default compiler settings. This would require looking for places where custom strategies may be needed (such as the particular collection used to keep track of names, discussed earlier), perhaps relying on ServiceLoader or some other mechanism to discover custom strategies, or to pick the default ones, during compiler bootstrap.

That sounds like a heavyweight solution compared to using LinkedHashSet but I might not grasp the full problem.
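
For reference, the ServiceLoader idea could look roughly like the sketch below. Everything named here (StrategyLookup, NameCollectionStrategy, DefaultStrategy) is hypothetical and does not exist in Groovy; only java.util.ServiceLoader is a real API. It is meant to show the amount of machinery involved, not a proposed design.

```java
import java.util.Collection;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.ServiceLoader;

public class StrategyLookup {
    /** Hypothetical extension point; not an existing Groovy interface. */
    public interface NameCollectionStrategy {
        Collection<String> newNameCollection();
    }

    /** Assumed default: an insertion-ordered set. */
    static final class DefaultStrategy implements NameCollectionStrategy {
        @Override
        public Collection<String> newNameCollection() {
            return new LinkedHashSet<>();
        }
    }

    // Use the first provider registered under META-INF/services, if any;
    // otherwise fall back to the default strategy.
    static NameCollectionStrategy load() {
        Iterator<NameCollectionStrategy> providers =
                ServiceLoader.load(NameCollectionStrategy.class).iterator();
        return providers.hasNext() ? providers.next() : new DefaultStrategy();
    }

    public static void main(String[] args) {
        System.out.println(load().newNameCollection().getClass().getSimpleName());
    }
}
```

With no provider on the classpath, load() returns the default, so a plain Groovy build would behave the same as simply hard-coding LinkedHashSet.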

Also: How about the timestamps embedded into classes' initialization logic, such as

    12: putstatic     #216                // Field __timeStamp__239_neverHappen1490860918788:J

Are those still around? Won't they also prevent repeatable builds?


On Fri, Apr 21, 2017 at 2:24 PM, Graeme Rocher <graeme.rocher@gmail.com> wrote:
Big +1 for making the compiler reproducible.

I think the usage of Sets and HashMap has an impact on Groovydoc too
because Groovydoc generates different output each time it is run.

For example the "extends" and "implements" output in Groovydoc changes
the order of the classes each time it is run. This must be down to
internal use of unordered sets or hash maps.

Regarding classpath (compiler vs compile classpath), in many cases AST
transforms reference classes from libraries that are on the "compile"
classpath. How would you deal with this case? Have them in both


On Sun, Apr 9, 2017 at 11:53 AM, Cédric Champeau <cchampeau@apache.org> wrote:
> Hi team!
> I would like to set up some additional goals for the next release of Groovy.
> As you may know, Gradle 3.5, released tomorrow, will ship with a build cache
> [1]. But Groovy is causing us some troubles, because the output of
> compilation is not reproducible. In other words, for the same inputs, we can
> randomly get a different output. To be clear, while the output of Groovy is
> semantically correct, we cannot guarantee that for the same sources, the
> same bytecode, byte to byte, is going to be generated. This is a big problem
> for Gradle, because it affects the cache key, and a cache miss has terrible
> consequences: since we need to rebuild everything when a build script
> classpath changes, a change in the output of a build script bytecode means
> we need to invalidate a full build...
> As an illustration, I fixed 2 issues this week [2] and [3]. But it's not
> enough. We should revise our usage of maps and sets, and use their linked
> counterparts when it makes sense. That alone cannot guarantee reproducible
> builds, in particular across platforms. There are hundreds of
> places where we use hash sets/maps, with objects that do not implement
> equals/hashcode, for example (and since those structures are mutable, we're
> very lucky because the default hashcode uses the system one, which is
> immutable).
> Eventually, if you have read my blog post about performance of Gradle 3.4
> [4], you would understand that the Java compiler does a much better job than
> we do at separating things required by the compiler from things required in
> user space. In particular, we at Groovy offer AST transformations, which is
> similar, but not equivalent, to what annotation processors are for javac.
> The problem is that Groovy only has a single compile classpath, which
> includes both the "compiler plugins" (AST transformations) and
> implementation dependencies of the project we compile. So we're effectively
> mixing things that shouldn't be mixed:
> - annotations for AST transformations should be on the compile classpath
> - implementation of the AST transformations should be on the AST
> transformations path (compiler classpath)
> If we don't do this, we cannot be as smart as what we do in the Java world,
> and compute what is relevant in terms of ABI (application binary interface).
> So any change to the classpath needs to be considered a breaking change and
> we need to recompile everything. This makes the implementation of a Groovy
> incremental compiler effectively impossible. Furthermore, it prevents
> AST transform implementors from using the libraries they want as dependencies,
> without leaking them to the user compile classpath (imagine an AST xform
> which uses Guava 18, while the user wants Guava 17 on his classpath). In
> practice, the implementation dependencies of the AST xform are only required
> at compile time, not runtime, so they should be separate.
> In short, I vote for the adoption of a new `-compilerpath` where we would
> put everything that affects the compiler itself, and leave `--classpath` for
> the user space. Doing this would let tools like Gradle be much smarter about
> what they can do.
> [1] https://docs.gradle.org/3.5-rc-3/release-notes.html
> [2] https://issues.apache.org/jira/browse/GROOVY-8142
> [3] https://issues.apache.org/jira/browse/GROOVY-8148
> [4] https://blog.gradle.org/incremental-compiler-avoidance

Graeme Rocher