I don’t know what function these files serve here, but IMO a blanket condemnation of precompiled classes as test data in Apache source releases makes certain kinds of projects impractical to develop at Apache.  If I were developing a byte code manipulation tool, I would want as test data a wide variety of unchanging byte code samples.  For instance, one category might result from a particular ALv2 Java file compiled with every possible compiler I could find, possibly also modified by every other byte code manipulation tool I could find.  I’d expect that I’d also want saved “output” byte code to check that the output doesn’t change.  Building such binary artifacts as part of the build completely eliminates their usefulness as test data.  Of course, how the byte code was constructed needs to be carefully documented.
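
To make the "saved byte code that must not change" idea concrete, here is a rough sketch, in Python only for brevity.  Everything in it is hypothetical: the names check_fixture and demo, the Sample.class file name, and the stand-in bytes (only the 0xCAFEBABE magic number is genuine class-file content).  It assumes each stored fixture has a SHA-256 digest recorded next to the notes on how it was built.

```python
import hashlib
import tempfile
from pathlib import Path

# Every JVM class file starts with these four bytes.
CLASS_MAGIC = bytes.fromhex("cafebabe")


def sha256_of(path):
    """Hex SHA-256 digest of the file at `path`."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def check_fixture(path, expected_digest):
    """True if the fixture looks like a class file and matches the recorded digest."""
    data = path.read_bytes()
    if data[:4] != CLASS_MAGIC:
        return False
    return hashlib.sha256(data).hexdigest() == expected_digest


def demo():
    # Stand-in bytes, NOT a real class file: only the magic number is genuine.
    fake_class = CLASS_MAGIC + b"\x00\x00\x00\x34" + b"placeholder"
    with tempfile.TemporaryDirectory() as tmp:
        fixture = Path(tmp) / "Sample.class"  # hypothetical fixture name
        fixture.write_bytes(fake_class)
        recorded = sha256_of(fixture)  # would normally live in a checked-in manifest
        unchanged = check_fixture(fixture, recorded)
        fixture.write_bytes(fake_class + b"!")  # simulate a rebuilt or altered fixture
        changed = check_fixture(fixture, recorded)
    return unchanged, changed


if __name__ == "__main__":
    print(demo())
```

A check like this only shows that a fixture is byte-for-byte what was recorded; it says nothing about whether the recorded bytes were trustworthy in the first place, which is why the provenance documentation still matters.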

David Jencks

On Jun 25, 2018, at 12:44 PM, Sean Owen <srowen@apache.org> wrote:

Yes the code in there is ALv2 licensed; appears to be either created for Spark or copied from Hive. Yes, irrespective of the policy issue, it's important to be able to recreate these JARs somehow, and I don't think we have the source in the repo for all of them (at least, the ones that originate from Spark). That much seems like a must-do.

After that, seems worth figuring out just how hard it is to build these artifacts from source. If it's easy, great. If not, either the test can be removed or we figure out just how hard a requirement this is.

On Mon, Jun 25, 2018 at 11:34 AM Alex Harui <aharui@adobe.com.invalid> wrote:

I am not an official answer person, but IMO, the first question is:  “Is the source for TestSerDe.jar ‘open source’ under an ALv2-compatible license?”


If “yes”, then supply the source in the source release and not the JAR.  One of the reasons for “no compiled code in a source release” is that it is very difficult to verify that compiled code is “correct” and not corrupted, infected with a virus, etc.


If “no”, then treat it as a 3rd-party dependency, which may mean you can’t use it or need to treat it as an optional or runtime dependency.


The related question is:  How do folks modify this JAR?  If it were a JPEG, there are plenty of JPEG modification tools.  There really aren’t JAR modification tools that modify a JAR’s internal .class files; you really should use the source files.  I am still surprised/puzzled by the answer in the thread you linked to.  It still seems in both cases that a “binary” is being supplied for “convenience”.  IMO, there should be very few, if any, things in an Apache source repo that are “unmodifiable”.


The “workaround” of renaming the .jar or .class files to something else so they aren’t seen as executable code still doesn’t seem to fully meet the spirit of an open source release, either, but it is better than shipping executable code in a source package.


On the other hand, I would not hold up a release for an issue like this.  Fix it in some future release.


My 2 cents,