www-legal-discuss mailing list archives

From Jan Lahoda <lah...@gmail.com>
Subject Re: LICENSE and NOTICE file content
Date Tue, 26 Jun 2018 08:43:47 GMT
(As NetBeans has, among other things, a library for reading classfiles, I
guess this discussion also relates to it, and I'd like to share some of my
thoughts.)

On Tue, Jun 26, 2018 at 7:15 AM, Alex Harui <aharui@adobe.com.invalid>
wrote:

> AIUI, our primary objectives for open source are about “sharing” (and
> open-ness in general) and “security/safety”.  So yeah, there is some
> overhead to being an open source project.  We want source packages that
> folks can use in other ways, and that folks can use without fear of getting
> infected by a virus.  We generally recommend that folks use our code by
> building from sources.  If the source package contains executable code,
> there is always a chance that some evil person will find a way to exploit
> that.
>
>
>
> So, IMO, even a byte-code manipulation tool that has test data, probably
> had that test data compiled from some source.  We should make that source
> available so that folks can try different variations, or even fix a bug
> that’s “been there forever that nobody found until just now”.
>

I think one should be very (very, very) careful when modifying test data.
One needs to be absolutely sure the test still tests what it was testing
before, otherwise "fixing" a "bug" in the test data may actually make the
test useless. "Negative" tests (tests that verify that something (usually a
crash/exception) does not happen) are particularly prone to such an
accidental invalidation.


>
>
> However, I don’t know of any Apache policy or convention that dictates
> that the source for the test byte code must be provided in the same package
> as the byte-code manipulation tool.  You could create a separate release of
> the source for the test byte code and never release it again if it never
> changes.  Then that would be an upstream dependency for the byte-code
> manipulation tool source package.  But if it were up to me, the tool’s
> source package still would not contain the byte
>

I assume when a bug is fixed, a new test would (ideally) be written, which
means a new set of test data, which means a new release of the test data,
right? So a bug absolutely cannot be fixed (with a test) quicker than in 3
days (3+3 days for podlings)? (And one needs to be careful not to change
the existing test data in convenience binaries, just add the new data, of
course.)


> code unless there is some way to solve the “security/safety” goal.  Maybe
> it is good enough to give the file a different suffix so it appears as a
> non-executable file.  But I would probably just have the tool’s source
> package build script download the convenience binary of the upstream test
> source package.
>
>
>
> That is extra overhead for sure, but I don’t think that is ‘impractical’.
> And I still wouldn’t hold up any release for this kind of issue.
> Incrementally make improvements in subsequent releases.  Create a test-data
> source release.  Then adjust the main source package to download the test
> jar.
>

I may be too pessimistic, but in my experience when creating a test is more
complicated, the probability of having a test decreases. And not having a
test feels like a sub-optimal software engineering practice.

FTR, I think there are multiple ways to avoid having classfiles in the
repository, such as using jcod (not sure whether that's OK or not); at the
same time, I think having an approach that does not discourage proper
engineering practices has benefits.

Jan


>
>
> My 2 cents,
>
> -Alex
>
>
>
> *From: *David Jencks <david.a.jencks@gmail.com>
> *Reply-To: *"legal-discuss@apache.org" <legal-discuss@apache.org>
> *Date: *Monday, June 25, 2018 at 1:31 PM
> *To: *"legal-discuss@apache.org Discuss" <legal-discuss@apache.org>
> *Subject: *Re: LICENSE and NOTICE file content
>
>
>
> I don’t know what function these files serve here, but IMO a blanket
> condemnation of precompiled classes for test data in apache source releases
> makes certain kinds of projects impractical to develop at apache.  If I was
> developing a byte code manipulation tool, I would want as test data a wide
> variety of unchanging byte code samples.  For instance, one category might
> result from a particular AL2 Java file compiled with every possible
> compiler I could find, possibly also modified by every other byte code
> manipulation tool I could find.  I’d expect that I’d also want saved
> “output” byte code to check that the output doesn’t change.  Building such
> binary artifacts as part of the build completely eliminates their
> usefulness as test data. Of course how the byte code was constructed needs
> to be carefully documented.
>
>
>
> David Jencks
>
>
>
> On Jun 25, 2018, at 12:44 PM, Sean Owen <srowen@apache.org> wrote:
>
>
>
> Yes the code in there is ALv2 licensed; appears to be either created for
> Spark or copied from Hive. Yes, irrespective of the policy issue, it's
> important to be able to recreate these JARs somehow, and I don't think we
> have the source in the repo for all of them (at least, the ones that
> originate from Spark). That much seems like a must-do.
>
>
>
> After that, seems worth figuring out just how hard it is to build these
> artifacts from source. If it's easy, great. If not, either the test can be
> removed or we figure out just how hard a requirement this is.
>
> On Mon, Jun 25, 2018 at 11:34 AM Alex Harui <aharui@adobe.com.invalid>
> wrote:
>
> I am not an official answer person, but IMO, the first question is:  “Is
> the source for TestSerDe.jar ‘open source’ under an ALv2-compatible
> license?”.
>
>
>
> If “yes”, then supply the source in the source release and not the JAR.
> One of the reasons for “no compiled code in a source release” is that it is
> very difficult to verify that compiled code is “correct” and not corrupted,
> infected with a virus, etc.
>
>
>
> If “no”, then treat as a 3rd-party dependency.  Which may mean you can’t
> use it or need to treat it as optional, or a runtime dependency.
>
>
>
> The related question is:  How do folks modify this JAR?  If it was a JPEG,
> there are plenty of JPEG modification tools.  There really aren’t JAR
> modification tools that modify a JAR's internal .class files; you really
> should use the source files.  I am still surprised/puzzled by the answer in
> the thread you linked to.  It still seems in both cases that a “binary” is
> being supplied for “convenience”.  IMO, there should be very few, if any,
> things in an Apache source repo that are “unmodifiable”.
>
>
>
> The “workaround” of renaming the .jar or .class files to something else so
> it isn’t seen as executable code seems like it still doesn’t fully meet the
> spirit of an open source release, either, but better than shipping
> executable code in a source package.
>
>
>
> On the other hand, I would not hold up a release for an issue like this.
> Fix it in some future release.
>
>
>
> My 2 cents,
>
> -Alex
>
>
>
>
>
