mxnet-dev mailing list archives

From "Lausen, Leonard" <lau...@amazon.com.INVALID>
Subject Re: Stop redistributing source code of 3rdparty dependencies to avoid licensing issues
Date Mon, 20 Jan 2020 21:17:09 GMT
Quote from Tianqi:
> The pro of doing so is that it indeed simplifies the release process, as
> these additional dependencies becomes category-B level dependencies as in
> https://www.apache.org/legal/resolved.html

Why would the dependencies become category-B level? It seems all licensing
considerations only apply to ASF distributions (source or convenience binary).
Category-B level software is software that can't be included in the source
distribution but may be included in ASF convenience binaries. 
https://www.apache.org/legal/resolved.html#binary-only-inclusion-condition

I believe we currently do not have ASF convenience binaries (though there are
some convenience binaries unrelated to ASF published on PyPI and in S3 buckets).

With respect to the source distribution, no licensing considerations apply to
non-bundled dependencies:
"LICENSE and NOTICE MUST NOT provide unnecessary information about materials
which are not bundled in the package, such as separately downloaded
dependencies." 
http://www.apache.org/legal/release-policy.html#licensing-documentation

> The con of doing so is that it brings additional burden to the users of the
> software to check the license of these dependencies, in some sense,
> including these information in the
> license actually gives an extra level of transparency.

I agree. However, given the licensing issues at every release, we haven't yet
succeeded in providing this transparency.

Quote from Marco:
> The question at this point is whether we are allowed to differentiate
> between our main-source and hold it to the strict standards while treating
> the third party folder as dependency, where we only have to verify that the
> projects are licensed with an Apache compatible license.

I don't think so. If it's in the source distribution, it must be appropriately
declared in LICENSE and NOTICE.

> At the moment, the project already treats them different: our license
> checks exclude third party. I think this is where the disparity is coming
> from.

Indeed, we currently don't check 3rdparty code in an automated way.
I'm not sure how much overhead maintaining more fine-grained excludes would
add (compared to the current blanket exclude of everything in 3rdparty). When
writing the mail, I assumed the overhead would be significant.
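
To make the idea of an automated check with fine-grained excludes concrete,
here is a minimal sketch along the lines of the Python script Tianqi suggests
further down in this thread: walk a directory tree, skip paths matching
exclude patterns, and record the lines mentioning "Copyright" near the top of
each file. The function names, the exclude patterns and the 30-line window are
assumptions made for illustration only; this is not the license checker MXNet
actually uses.

```python
import fnmatch
import os


def collect_copyright_lines(root, exclude_globs=(), max_lines=30):
    """Map each non-excluded file under root (path relative to root, with
    forward slashes) to the lines among its first max_lines lines that
    mention "copyright" (case-insensitive)."""
    found = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root).replace(os.sep, "/")
            # Fine-grained excludes: skip only the paths that match a pattern.
            if any(fnmatch.fnmatchcase(rel, pat) for pat in exclude_globs):
                continue
            lines = []
            with open(path, encoding="utf-8", errors="replace") as fh:
                for index, line in enumerate(fh):
                    if index >= max_lines:
                        break
                    if "copyright" in line.lower():
                        lines.append(line.strip())
            found[rel] = lines
    return found


def files_without_copyright(found):
    """Return the sorted paths for which no copyright line was detected."""
    return sorted(path for path, lines in found.items() if not lines)
```

Calling collect_copyright_lines("3rdparty", exclude_globs=("ps-lite/tests/*",))
would, under these assumptions, give a per-file listing that could be
cross-matched against LICENSE and NOTICE. A detected "Copyright" line says
nothing about which license applies, so a human review would still be needed.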

> I'd recommend we discuss with Apache how we can handle this
> situation: package third party code for user convenience while limiting
> responsibility. In the end, we still have to ensure that everything is
> licensed properly, so maybe we should try to align both processes to match the
> real world instead of changing the real world to match the process.

What do you mean by "changing the real world to match the process"? How would
we "align both processes to match the real world"?

As a PPMC member, would you be able to ask ASF for a recommendation?

If I remember correctly, our mentor Markus Weimer suggested in a conversation
at KDD 2019 that it's easiest to stop bundling non-ASF 3rdparty code.

Quote from Pedro:
> Source archives that need to download too many dependencies to build will end 
> up broken with time. I would expect source to build with a reasonable set of
> well known system dependencies.

Yes, it's a valid consideration.

Best regards
Leonard

On Fri, 2020-01-17 at 22:04 +0100, Marco de Abreu wrote:
> I agree with Tianqi. We may change our build system, but this won't free us
> from the necessity to validate the licenses of our dependencies.
> 
> The question at this point is whether we are allowed to differentiate
> between our main-source and hold it to the strict standards while treating
> the third party folder as dependency, where we only have to verify that the
> projects are licensed with an Apache compatible license.
> 
> At the moment, the project already treats them different: our license
> checks exclude third party. I think this is where the disparity is coming
> from. I'd recommend we discuss with Apache how we can handle this
> situation: package third party code for user convenience while limiting
> responsibility.
> 
> In the end, we still have to ensure that everything is licensed properly,
> so maybe we should try to align both processes to match the real world
> instead of changing the real world to match the process.
> 
> -Marco
> 
> Tianqi Chen <tqchen@cs.washington.edu> schrieb am Fr., 17. Jan. 2020, 20:44:
> 
> > I don't have an opinion, but would like to list pros and cons of doing so.
> > 
> > The pro of doing so is that it indeed simplifies the release process, as
> > these additional dependencies becomes category-B level dependencies as in
> > https://www.apache.org/legal/resolved.html
> > 
> > The con of doing so is that it brings additional burden to the users of the
> > software to check the license of these dependencies, in some sense,
> > including these information in the
> > license actually gives an extra level of transparency.
> > 
> > The copyright message in some of the dependencies are a bit unfortunate,
> > one potential way to run the check is to write a python script to go
> > through the files and detect the line Copyright and cross match and add
> > them.
> > 
> > Note that good models to follow are
> > - hadoop: https://github.com/apache/hadoop/tree/trunk/licenses
> > - flink: https://github.com/apache/flink
> > 
> > Each of the repo have a licenses folder that contains licenses, and things
> > points to them.
> > 
> > I am not a lawyer, but the case for ps-lite seems can be resolved as long
> > as we can confirm these files follows Apache-2.0, as
> > https://www.apache.org/licenses/LICENSE-2.0 only requires us to
> > redistribute the license and anything in the NOTICE, but we do not have
> > the obligation to list all the copyright messages in the source content.
> > 
> > TQ
> > 
> > On Fri, Jan 17, 2020 at 11:10 AM Yuan Tang <terrytangyuan@gmail.com>
> > wrote:
> > 
> > > +1
> > > 
> > > On Fri, Jan 17, 2020 at 1:59 PM Chris Olivier <cjolivier01@gmail.com>
> > > wrote:
> > > 
> > > > +1
> > > > 
> > > > On Fri, Jan 17, 2020 at 10:19 AM Lausen, Leonard <lausen@amazon.com.invalid>
> > > > wrote:
> > > > 
> > > > > Dear MXNet community,
> > > > > 
> > > > > > as per recent mail on general@incubator.apache.org [1] there are a
> > > > > > number of licensing issues in MXNet 1.6rc1. Based on anecdotal
> > > > > > evidence I believe there has been no release so far without any
> > > > > > licensing issues, which is a blocker to MXNet graduating from its
> > > > > > incubating status. One contributing factor is that we bundle
> > > > > > 3rdparty source code in our releases [2].
> > > > > 
> > > > > > One key factor is that 3rdparty projects don't always enforce
> > > > > > licensing best practice in the way we do. For example,
> > > > > > 3rdparty/ps-lite doesn't enforce license headers in the source
> > > > > > files and there has been confusion about the license of recent
> > > > > > contributions by ByteDance (See [1]).
> > > > > 
> > > > > > To avoid such licensing issues in MXNet releases a simple solution
> > > > > > is to stop distributing the 3rdparty code in our source releases.
> > > > > > Instead, we can adapt our buildsystem to download 3rdparty code as
> > > > > > part of the build configuration process. CMake makes this very
> > > > > > easy with the FetchContent module [3].
> > > > > 
> > > > > > For development purpose involving changes to the 3rdparty source
> > > > > > or build systems that can't access the internet, there are easy
> > > > > > means for specifying the location of local sources (instead of
> > > > > > downloading), via the FETCHCONTENT_SOURCE_DIR_<someName> variable
> > > > > > [4].
> > > > > 
> > > > > > Would there be any concerns about such approach? Obviously it can
> > > > > > only be fully implemented as soon as the CMake build system is
> > > > > > feature complete and the Makefile build can be dropped. (Note that
> > > > > > the Makefile build is being deprecated and removed as part of
> > > > > > MXNet 2 roadmap [5])
> > > > > 
> > > > > Best regards
> > > > > Leonard
> > > > > 
> > > > > [1]:
> > > > > 
> > > > > 
> > https://lists.apache.org/thread.html/rb83ff64bdac464df2f0cf2fe8fb4c6b9d3b8fa62b645763dc606045f%40%3Cgeneral.incubator.apache.org%3E
> > > > > [2]: See the .tar.gz files at
> > > > > https://incubator.apache.org/clutch/mxnet.html
> > > > > [3]: https://cmake.org/cmake/help/latest/module/FetchContent.html
> > > > > [4]: https://cmake.org/pipermail/cmake/2019-June/069709.html
> > > > > [5]: https://github.com/apache/incubator-mxnet/issues/16167
> > > > > 