mxnet-dev mailing list archives

From kellen sunderland <kellen.sunderl...@gmail.com>
Subject Re: Publishing Scala Package/namespace change
Date Sun, 11 Mar 2018 20:54:03 GMT
"Why refactoring and deprecating means separating version from mxnet core?
Apache Spark MLLib refactors and deprecates a lot (e.g., they deprecates
RDD API), our C API also deprecates things, remember there are a bunch of
xxxEx in c_api.h?"

Did MLlib increase its major version after deprecating the RDD API?

"They will. Scala API runs auto code-generation to extract Symbol method
from MXNet core. For example, users can write and compile
Symbol.NewOperator with one Scala API version, but they cannot run it with
an mxnet-core .so which does not have NewOperator / or have NewOperator
with different args."

Not sure I fully understand the scenario you're describing here.  Is this
the case where a user writes a new operator against one version of
libmxnet.so and then runs it on an older version?  In this case they'd need
to set a dependency on the current libmxnet.so ABI that they're running
against, and ensure that their jar was using that version or newer.  This
is the goal of SemVer per interface.
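
To make the rule concrete, here is a minimal sketch of the check I have in mind. It assumes plain SemVer semantics (same major version, and a runtime at least as new as the build-time version). The names `parse_version` and `is_compatible` are hypothetical and not part of MXNet or the Scala package.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int = 0


def parse_version(text: str) -> Version:
    """Parse 'MAJOR.MINOR[.PATCH]' into a Version, padding missing parts with 0."""
    parts = [int(p) for p in text.split(".")]
    while len(parts) < 3:
        parts.append(0)
    return Version(*parts[:3])


def is_compatible(built_against: str, runtime: str) -> bool:
    """Assumed SemVer rule: the runtime library must share the major version
    of the library the jar was built against and must not be older than it."""
    built = parse_version(built_against)
    run = parse_version(runtime)
    return run.major == built.major and run >= built
```

Under this assumption a jar built against ABI 1.1 runs on 1.2, but not on 1.0 (an operator may be missing) and not on 2.0 (the interface may have changed).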

"By doing major version change to Scala API, we remind users 'hey, be
careful, we have something incompatible!' But then what?"
They either choose to update their package and then fix potential breaking
API changes (the likely case), or they stick with the current version.

"Users get more confused with the version mapping. And it introduces
overhead to maintain."
I'm not sure why users even need to know about the version mapping.  If I'm
only interested in the Scala package from maven, why do I care which
version of libmxnet.so I'm using?
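
As a rough illustration of why the mapping does not have to be user-facing: if each published package simply records the core version it bundles, the lookup is plain data that tooling can carry. The sketch below copies the example numbers from Marco's table quoted further down; they are his illustration, not a real release plan, and the names `BUNDLED_CORE`, `core_version_for` and `apis_by_core_version` are hypothetical.

```python
from collections import defaultdict

# Interface package version -> bundled MXNet core version.
# Values copied from the example table in Marco's mail (illustrative only).
BUNDLED_CORE = {
    "Scala": {"1.0": "1.0", "1.1": "1.1", "2.0": "1.2", "2.1": "1.3", "3.0": "2.0"},
    "R": {"1.0": "1.0", "2.0": "1.1", "2.1": "1.2", "2.2": "1.3", "3.0": "2.0"},
}


def core_version_for(interface: str, package_version: str) -> str:
    """Return the core version bundled with a given interface package release."""
    return BUNDLED_CORE[interface][package_version]


def apis_by_core_version() -> dict:
    """Invert the mapping: core version -> {interface: package version}."""
    inverted = defaultdict(dict)
    for interface, releases in BUNDLED_CORE.items():
        for package_version, core_version in releases.items():
            inverted[core_version][interface] = package_version
    return dict(inverted)
```

A user who depends on the Scala package at 2.0 never has to look at this table; the package already pins the core it was built and tested with.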



On Sun, Mar 11, 2018 at 8:06 PM, YiZhi Liu <eazhi.liu@gmail.com> wrote:

> > Changing namespaces is one example of a required major version change, but
> > there are more reasons like general refactoring or some deprecated APIs
> > just being hard to maintain.
>
> Why refactoring and deprecating means separating version from mxnet core?
> Apache Spark MLLib refactors and deprecates a lot (e.g., they deprecates
> RDD API), our C API also deprecates things, remember there are a bunch of
> xxxEx in c_api.h?
>
> > They won't get a strange error, assuming we're talking about Scala users
> > who are upgrading from a package with the same namespace they will rely on
> > the package manager to give them an update which should be painless.
>
> They will. Scala API runs auto code-generation to extract Symbol method
> from MXNet core. For example, users can write and compile
> Symbol.NewOperator with one Scala API version, but they cannot run it with
> an mxnet-core .so which does not have NewOperator / or have NewOperator
> with different args.
>
> By doing major version change to Scala API, we remind users 'hey, be
> careful, we have something incompatible!' But then what? Users get more
> confused with the version mapping. And it introduces overhead to maintain.
>
> @Chris, I think we can have two separate votes.
>
>
> 2018-03-11 9:19 GMT-07:00 Chris Olivier <cjolivier01@gmail.com>:
>
> > Ok, so why don’t we have two votes?
> >
> > 1) change namespace is a separate vote since it’s a code change and has
> > different voting rules (can be vetoed)
> >
> > 2) whether to disconnect non-C-API versioning from C-API versioning and
> > have parallel versioning of all non-C APIs (process rule, so majority, I
> > think is the rule, right?)
> >
> > -Chris
> >
> > On Sun, Mar 11, 2018 at 8:46 AM kellen sunderland <
> > kellen.sunderland@gmail.com> wrote:
> >
> > > > Sorry, the namespace should have been 'org.apache.mxnet' with the artifact
> > > > as 'mxnet-incubating'.
> > >
> > > On Sun, Mar 11, 2018 at 4:44 PM, kellen sunderland <
> > > kellen.sunderland@gmail.com> wrote:
> > >
> > > > YiZhi, In general I agree that your points and examples are the ideal
> > > > case, but in the MXNet situation there are some trade-offs we have to
> > > > make.  Let me try to specifically answer your points:
> > > >
> > > > "Do you mean we have different version for 'ml.dmlc' namespace and
> > > > 'org.apache' namespace?"
> > > > > No, I am not trying to say that. I believe Marco, Naveen and I are all
> > > > > proposing we use a single org.apache.incubating.mxnet namespace moving
> > > > > forward, which would require a major version change to our product API
> > > > > under our current versioning scheme.  Marco and I are proposing we apply
> > > > > this MV change _only_ to the scala package's API.
> > > >
> > > > "How to tell which Scala API version works with which MXNet core
> > version?
> > > > By document?"
> > > > Yes users will be able to tell via the website, release docs, maven
> > > > package information, pom file, etc.
> > > >
> > > > "How many users will read the whole document and carefully pair the
> > > > version id before they run into a strange error and give up?"
> > > > > They won't get a strange error, assuming we're talking about Scala users
> > > > > who are upgrading from a package with the same namespace they will rely on
> > > > > the package manager to give them an update which should be painless.
> > > >
> > > > > Secondly software developers understand that packages, not products, have
> > > > > versions.  They know that these versions are used to communicate when APIs
> > > > > are broken.  There are examples of Apache packages doing this for packages
> > > > > that include multiple interfaces, for example first-party modules packaged
> > > > > with the HTTP server, or log4j's language bindings (arguably quite similar
> > > > > to what Naveen is doing).
> > > >
> > > > > While we can debate the right way to version packages, I think there's a
> > > > > clear community decision here to get Naveen unblocked:
> > > > >
> > > > > (1) We continue semantically versioning across all APIs, meaning that this
> > > > > change would get released with MXNet 2.*.
> > > > > (2) You version package interfaces semantically and have a compatible
> > > > > version mapping.
> > > > > (3) Status quo, we continue to release a Scala package as-is, breaking
> > > > > apache guidelines for artifact generation.
> > > > > (4) We rely on the namespace change itself to communicate a change in the
> > > > > interface.  We don't consider this a major change.
> > > >
> > > > My (non-binding) preference would be for option 2.
> > > >
> > > > -Kellen
> > > >
> > > > On Sun, Mar 11, 2018 at 12:44 PM, Marco de Abreu <
> > > > marco.g.abreu@googlemail.com> wrote:
> > > >
> > > > >> Changing namespaces is one example of a required major version change, but
> > > > >> there are more reasons like general refactoring or some deprecated APIs
> > > > >> just being hard to maintain. Things like these happen quite frequently and
> > > > >> it's a problem every software project has to face and find a solution for.
> > > >>
> > > > >> Regarding 'How to tell which Scala API version works with which MXNet core
> > > > >> version?': We could just bundle MXNet with the released API package as we
> > > > >> do right now, but we would give each interface its own version and publish
> > > > >> them on their distribution platforms accordingly. Just an example:
> > > >> >Scala-Package -> MXNet-Version
> > > >> >> 1.0 -> 1.0
> > > >> >> 1.1 -> 1.1
> > > >> >> 2.0 -> 1.2
> > > >> >> 2.1 -> 1.3
> > > >> >> 3.0 -> 2.0
> > > >>
> > > >> > R-Package -> MXNet-Version
> > > >> >> 1.0 -> 1.0
> > > >> >> 2.0 -> 1.1
> > > >> >> 2.1 -> 1.2
> > > >> >> 2.2 -> 1.3
> > > >> >> 3.0 -> 2.0
> > > >>
> > > > >> This is always an N-to-1 mapping, with N being the versions of our APIs
> > > > >> and 1 the MXNet Core version. From MXNet's versioning perspective, this
> > > > >> would then look like the following:
> > > >> > MXNet-Version -> APIs
> > > >> >> 1.0 -> Scala_1.0; R_1.0
> > > >> >> 1.1 -> Scala_1.1; R_2.0
> > > >> >> 1.2 -> Scala_2.0; R_2.1
> > > >> >> 1.3 -> Scala_2.1; R_2.2
> > > >> >> 2.0 -> Scala_3.0; R_3.0
> > > >>
> > > > >> This would give us the liberty to develop MXNet without restricting us too
> > > > >> much - of course, major version increments will still have to be considered
> > > > >> carefully. I don't think that this would harm transparency too much and
> > > > >> there's no need to write big documentation.
> > > >>
> > > >> -Marco
> > > >>
> > > >>
> > > > >> On Sun, Mar 11, 2018 at 12:16 PM, YiZhi Liu <liuyizhi@apache.org> wrote:
> > > >>
> > > > >> > I have no idea how separating Scala API version can solve the
> > > > >> > 'compatibility' problem. Do you mean we have different version for
> > > > >> > 'ml.dmlc' namespace and 'org.apache' namespace? Do these two versions
> > > > >> > have same behavior? How to tell which Scala API version works with
> > > > >> > which MXNet core version? By document? How many users will read the
> > > > >> > whole document and carefully pair the version id before they run into
> > > > >> > a strange error and give up?
> > > >> >
> > > > >> > Moreover, changing namespace is an issue that is really rare and
> > > > >> > hardly happens. For other 'compatibility' problem, for example, the
> > > > >> > class/function definitions, should handle the compatibility itself.
> > > > >> > You'll never expect a project to have a different version for changing
> > > > >> > 'calculate(int)' to 'calculate(float)', it should just add a new
> > > > >> > function 'calculate(float)'.
> > > >> >
> > > > >> > Regarding 'In this case the Scala interface is clearly a separate
> > > > >> > entity from the C API.'. Everything can be seen as a separate entity,
> > > > >> > the mxnet engine, the graph description, operators, python API, gluon
> > > > >> > API, etc. We should think carefully what we want to provide, and what
> > > > >> > our users need.
> > > >> >
> > > > >> > As an example, Apache Spark, still has SparkR (R API), PySpark (Python
> > > > >> > API), MLLib, GraphX ... as part of its release, and have the same
> > > > >> > version as Spark core as well as its Scala/Java API.
> > > >> >
> > > >> > 2018-03-10 23:58 GMT-08:00 kellen sunderland <
> > > >> kellen.sunderland@gmail.com
> > > >> > >:
> > > >> > > +1 (non-binding) to what Marco is describing.  +1 (non-binding)
> to
> > > >> > getting the Scala bindings with the namespace change into Maven.
> > > >> > >
> > > > >> > > The general best practice for SemVer, which is used by most projects
> > > > >> > > that employ SemVer, is to apply SemVer to the public APIs of packages that
> > > > >> > > ship with your project.  If you have several independent APIs this could
> > > > >> > > mean that they are versioned separately from each other, and from the
> > > > >> > > overall project versioning mechanism.
> > > >> > >
> > > > >> > > For example, the .NET Core library ships with a number of binaries, each
> > > > >> > > with their own SemVerioned APIs.  They also have a high-level, easy to
> > > > >> > > understand version for the package as a whole:
> > > > >> > > https://docs.microsoft.com/en-us/dotnet/core/versions/.
> > > >> > >
> > > > >> > > Nodesource has a good description of this:
> > > > >> > > http://nodesource.com/blog/semver-a-primer/
> > > > >> > > “Semver is a scheme for interface versioning for the benefit of
> > > > >> > > interface consumers, thus if a tool has multiple interfaces, e.g. an API
> > > > >> > > and a CLI, these interfaces may evolve independent versioning.”
> > > >> > >
> > > > >> > > SemVer at its core is a communication mechanism to inform developers of
> > > > >> > > incompatibilities. In this case the Scala interface is clearly a separate
> > > > >> > > entity from the C API.  I.e. changing the Scala namespace isn’t going to
> > > > >> > > break C API users.  It does not communicate anything useful to these users
> > > > >> > > if we up their major version in response to a Scala change, it simply
> > > > >> > > breaks compatibility.  If we group all interfaces together, and increment
> > > > >> > > whenever any of them has a breaking change we’ll soon be at MXNet version
> > > > >> > > 587.  We’ll be forcing our users to check compatibility and update their
> > > > >> > > dependency tracking constantly.  The end result is that our users will stop
> > > > >> > > pulling in new versions of the library.
> > > >> > >
> > > > >> > > What I would propose is that (1) we have a high-level SemVer system that
> > > > >> > > tracks our C_API.  This is the ‘MXNet’ version that we generally refer to
> > > > >> > > and emphasize for our public releases.  For each API we have an independent
> > > > >> > > versioning system that if we can, we fix to the MXNet version.  When it
> > > > >> > > makes sense we version these APIs independently.  So for example we could
> > > > >> > > have a MXNet 1.2 release that ships with a 2.0 Scala API / R API.
> > > >> > >
> > > > >> > > In terms of Apache process I think shipping artifacts with a non-Apache
> > > > >> > > namespace is a bigger issue than whatever versioning conventions we decide
> > > > >> > > to use.
> > > >> > >
> > > >> > > -Kellen
> > > >> > >
> > > >> > > From: Carin Meier
> > > >> > > Sent: Saturday, March 10, 2018 1:41 PM
> > > >> > > To: dev@mxnet.incubator.apache.org
> > > >> > > Cc: dev@mxnet.apache.org
> > > >> > > Subject: Re: Publishing Scala Package/namespace change
> > > >> > >
> > > > >> > > +1 as well. I'm actively developing a Clojure package for MXNet that uses
> > > > >> > > the jars from the Scala package.
> > > >> > >
> > > >> > > - Carin
> > > >> > >
> > > > >> > > On Fri, Mar 9, 2018 at 4:44 PM, YiZhi Liu <eazhi.liu@gmail.com> wrote:
> > > > >> > >
> > > > >> > >> +1 for changing the namespace asap. for the maven deploy, we can have
> > > > >> > >> it build along with pip deployment.
> > > >> > >>
> > > >> > >>
> > > >> > >> 2018-03-09 10:15 GMT-08:00 Naveen Swamy <mnnaveen@gmail.com>:
> > > >> > >> > Hi Guys,
> > > >> > >> >
> > > > >> > >> > I am working on MXNet Scala Inference APIs
> > > > >> > >> > <https://issues.apache.org/jira/browse/MXNET-50> along with another
> > > > >> > >> > contributor Roshani. A while back I noticed that we haven't been publishing
> > > > >> > >> > the scala package to Maven for a while now(last one being v0.11.1a under
> > > > >> > >> > the dmlc namespace).
> > > > >> > >> > Currently users have to build the package manually and then use it, this
> > > > >> > >> > hinders adoption and also is painful to build everything from source.
> > > >> > >> >
> > > > >> > >> > I also see that we haven't changed the namespace to org.apache and instead
> > > > >> > >> > are still ml.dmlc namespace.
> > > >> > >> >
> > > > >> > >> > I wanted to seek your opinion about changing the MXNet-Scala package
> > > > >> > >> > namespace to org.apache for the Scala package and publish to Maven in the
> > > > >> > >> > upcoming release. I understand that this probably breaks the Semver
> > > > >> > >> > semantics that is agreed upon, However I would like to point out that the
> > > > >> > >> > Scala package has never been published to maven as 1.0 under org.apache.
> > > >> > >> >
> > > >> > >> > Open to suggestions.
> > > >> > >> >
> > > >> > >> > Thanks, Naveen
> > > >> > >>
> > > >> > >>
> > > >> > >>
> > > >> > >> --
> > > >> > >> Yizhi Liu
> > > >> > >> DMLC member
> > > >> > >> Amazon Web Services
> > > >> > >> Vancouver, Canada
> > > >> > >>
> > > >> > >
> > > >> >
> > > >> >
> > > >> >
> > > >> > --
> > > >> > Yizhi Liu
> > > >> > DMLC member
> > > >> > Amazon Web Services
> > > >> > Vancouver, Canada
> > > >> >
> > > >>
> > > >
> > > >
> > >
> >
>
>
>
> --
> Yizhi Liu
> DMLC member
> Amazon Web Services
> Vancouver, Canada
>
