mxnet-dev mailing list archives

From YiZhi Liu <eazhi....@gmail.com>
Subject Re: Publishing Scala Package/namespace change
Date Sun, 11 Mar 2018 21:54:07 GMT
And just to make clear why I raised Spark MLLib as an example: I was arguing
that a namespace change is one of the very rare cases in which we have to make
APIs incompatible. In other situations the APIs evolve smoothly. MLLib marked
'train(RDD)' as '@deprecated' instead of removing it, and kept the APIs
compatible for a long time.
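
To make the deprecate-instead-of-remove pattern concrete, here is a minimal
sketch in Python. It is not MLLib or MXNet code: `train`, `train_dataset`, and
the toy "model" (a mean) are hypothetical names chosen only for illustration.
The old entry point keeps working, warns the caller, and delegates to the new
one.

```python
import warnings


def train_dataset(rows):
    """Hypothetical replacement entry point: returns the mean of the rows."""
    rows = list(rows)
    return sum(rows) / len(rows)


def train(rdd):
    """Hypothetical legacy entry point, kept so existing callers still work."""
    warnings.warn(
        "train(rdd) is deprecated; use train_dataset(rows) instead",
        DeprecationWarning,
        stacklevel=2,
    )
    # Delegate to the new API so both paths share one implementation.
    return train_dataset(rdd)


if __name__ == "__main__":
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        result = train([1.0, 2.0, 3.0])
    print(result, [str(w.message) for w in caught])
```

Callers of the old function see a warning but no breakage, so no major
version bump is needed until the old function is finally removed.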

2018-03-11 14:26 GMT-07:00 Chris Olivier <cjolivier01@gmail.com>:

> Since someone mentioned ABI, keep in mind that API compatibility does not
> necessarily mean ABI compatibility. libpython, for example, may, within a
> major version, guarantee backwards API compatibility (you can still compile
> successfully), but  does not guarantee ABI compatibility, as structure
> sizes may change, for example.
>
> On Sun, Mar 11, 2018 at 1:56 PM kellen sunderland <
> kellen.sunderland@gmail.com> wrote:
>
> > "Did ML lib increase their major version after deprecating RDD?"
> >
> > Answering my own question.  They will increase major version after RDD is
> > removed.  This is basically scenario 1 from above.  It would mean we
> > release MXNet 2.0 with the Scala changes.
> >
> > On Sun, Mar 11, 2018 at 9:54 PM, kellen sunderland <
> > kellen.sunderland@gmail.com> wrote:
> >
> > > "Why refactoring and deprecating means separating version from mxnet
> > > core?  Apache Spark MLLib refactors and deprecates a lot (e.g., they
> > > deprecates RDD API), our C API also deprecates things, remember there
> > > are a bunch of xxxEx in c_api.h?"
> > >
> > > Did ML lib increase their major version after deprecating RDD?
> > >
> > > "They will. Scala API runs auto code-generation to extract Symbol
> > > method from MXNet core. For example, users can write and compile
> > > Symbol.NewOperator with one Scala API version, but they cannot run it
> > > with an mxnet-core .so which does not have NewOperator / or have
> > > NewOperator with different args."
> > >
> > > Not sure I fully understand the scenario you're describing here.  Is
> > > this the case where a user writes a new operator against one version of
> > > libmxnet.so and then runs it on an older version?  In this case they'd
> > > need to set a dependency on the current libmxnet.so ABI that they're
> > > running against, and ensure that their jar was using that version or
> > > newer.  This is the goal of SemVer per interface.
> > >
> > > "By doing major version change to Scala API, we remind users 'hey, be
> > > careful, we have something incompatible!' But then what?"
> > > They either choose to update their package and then fix potential
> > > breaking API changes (the likely case), or they stick with the current
> > > version.
> > >
> > > "Users get more confused with the version mapping. And it introduces
> > > overhead to maintain."
> > > I'm not sure why users even need to know about the version mapping.  If
> > > I'm only interested in the Scala package from maven, why do I care
> > > which version of libmxnet.so I'm using?
> > >
> > >
> > >
> > > On Sun, Mar 11, 2018 at 8:06 PM, YiZhi Liu <eazhi.liu@gmail.com> wrote:
> > >
> > >> >
> > >> > Changing namespaces is one example of a required major version
> > >> > change, but there are more reasons like general refactoring or some
> > >> > deprecated APIs just being hard to maintain.
> > >>
> > >> Why refactoring and deprecating means separating version from mxnet
> > >> core? Apache Spark MLLib refactors and deprecates a lot (e.g., they
> > >> deprecates RDD API), our C API also deprecates things, remember there
> > >> are a bunch of xxxEx in c_api.h?
> > >>
> > >> > They won't get a strange error, assuming we're talking about Scala
> > >> > users who are upgrading from a package with the same namespace they
> > >> > will rely on the package manager to give them an update which should
> > >> > be painless.
> > >>
> > >> They will. Scala API runs auto code-generation to extract Symbol
> > >> method from MXNet core. For example, users can write and compile
> > >> Symbol.NewOperator with one Scala API version, but they cannot run it
> > >> with an mxnet-core .so which does not have NewOperator / or have
> > >> NewOperator with different args.
> > >>
> > >> By doing major version change to Scala API, we remind users 'hey, be
> > >> careful, we have something incompatible!' But then what? Users get
> > >> more confused with the version mapping. And it introduces overhead to
> > >> maintain.
> > >>
> > >> @Chris, I think we can have two separate votes.
> > >>
> > >>
> > >> 2018-03-11 9:19 GMT-07:00 Chris Olivier <cjolivier01@gmail.com>:
> > >>
> > >> > Ok, so why don’t we have two votes?
> > >> >
> > >> > 1) change namespace is a separate vote since it’s a code change and
> > >> > has different voting rules (can be vetoed)
> > >> >
> > >> > 2) whether to disconnect non-C-API versioning from C-API versioning
> > >> > and have parallel versioning of all non-C APIs (process rule, so
> > >> > majority, I think is the rule, right?)
> > >> >
> > >> > -Chris
> > >> >
> > >> > On Sun, Mar 11, 2018 at 8:46 AM kellen sunderland <
> > >> > kellen.sunderland@gmail.com> wrote:
> > >> >
> > >> > > Sorry, the namespace should have been 'org.apache.mxnet' with the
> > >> > > artifact as 'mxnet-incubating'.
> > >> > >
> > >> > > On Sun, Mar 11, 2018 at 4:44 PM, kellen sunderland <
> > >> > > kellen.sunderland@gmail.com> wrote:
> > >> > >
> > >> > > > YiZhi, In general I agree that your points and examples are the
> > >> > > > ideal case, but in the MXNet situation there are some trade-offs
> > >> > > > we have to make.  Let me try to specifically answer your points:
> > >> > > >
> > >> > > > "Do you mean we have different version for 'ml.dmlc' namespace
> > >> > > > and 'org.apache' namespace?"
> > >> > > > No I am not trying to say that. I believe Marco, Naveen and I
> > >> > > > are all proposing we use a single org.apache.incubating.mxnet
> > >> > > > namespace moving forward, which would require a major version
> > >> > > > change to our product API under our current versioning scheme.
> > >> > > > Marco and I are proposing we apply this MV change _only_ to the
> > >> > > > scala package's API.
> > >> > > >
> > >> > > > "How to tell which Scala API version works with which MXNet core
> > >> > > > version? By document?"
> > >> > > > Yes users will be able to tell via the website, release docs,
> > >> > > > maven package information, pom file, etc.
> > >> > > >
> > >> > > > "How many users will read the whole document and carefully pair
> > >> > > > the version id before they run into a strange error and give up?"
> > >> > > > They won't get a strange error, assuming we're talking about
> > >> > > > Scala users who are upgrading from a package with the same
> > >> > > > namespace they will rely on the package manager to give them an
> > >> > > > update which should be painless.
> > >> > > >
> > >> > > > Secondly software developers understand that packages, not
> > >> > > > products, have versions.  They know that these versions are used
> > >> > > > to communicate when APIs are broken.  There's examples of Apache
> > >> > > > packages doing this for packages that include multiple
> > >> > > > interfaces, for example first-party modules packaged with the
> > >> > > > HTTP server, or log4j's language bindings (arguably quite similar
> > >> > > > to what Naveen is doing).
> > >> > > >
> > >> > > > While we can debate the right way to version packages, I think
> > >> > > > there's a clear community decision here to get Naveen unblocked:
> > >> > > >
> > >> > > > (1) We continue semantically versioning across all APIs, meaning
> > >> > > > that this change would get released with MXNet 2.*.
> > >> > > > (2) You version package interfaces semantically and have a
> > >> > > > compatible version mapping.
> > >> > > > (3) Status quo, we continue to release a Scala package as-is,
> > >> > > > breaking apache guidelines for artifact generation.
> > >> > > > (4) We rely on the namespace change itself to communicate a
> > >> > > > change in the interface.  We don't consider this a major change.
> > >> > > >
> > >> > > > My (non-binding) preference would be for option 2.
> > >> > > >
> > >> > > > -Kellen
> > >> > > >
> > >> > > > On Sun, Mar 11, 2018 at 12:44 PM, Marco de Abreu <
> > >> > > > marco.g.abreu@googlemail.com> wrote:
> > >> > > >
> > >> > > >> Changing namespaces is one example of a required major version
> > >> > > >> change, but there are more reasons like general refactoring or
> > >> > > >> some deprecated APIs just being hard to maintain. Things like
> > >> > > >> these happen quite frequently and it's a problem every software
> > >> > > >> project has to face and find a solution for.
> > >> > > >>
> > >> > > >> Regarding 'How to tell which Scala API version works with which
> > >> > > >> MXNet core version?': We could just bundle MXNet with the
> > >> > > >> released API package as we do right now, but we would give each
> > >> > > >> interface its own version and publish them on their
> > >> > > >> distribution platforms accordingly. Just an example:
> > >> > > >> Scala-Package -> MXNet-Version
> > >> > > >>     1.0 -> 1.0
> > >> > > >>     1.1 -> 1.1
> > >> > > >>     2.0 -> 1.2
> > >> > > >>     2.1 -> 1.3
> > >> > > >>     3.0 -> 2.0
> > >> > > >>
> > >> > > >> R-Package -> MXNet-Version
> > >> > > >>     1.0 -> 1.0
> > >> > > >>     2.0 -> 1.1
> > >> > > >>     2.1 -> 1.2
> > >> > > >>     2.2 -> 1.3
> > >> > > >>     3.0 -> 2.0
> > >> > > >>
> > >> > > >> This is always an N-to-1 mapping, with N being the versions of
> > >> > > >> our APIs and 1 the MXNet Core version. From MXNet's versioning
> > >> > > >> perspective, this would then look like the following:
> > >> > > >> MXNet-Version -> APIs
> > >> > > >>     1.0 -> Scala_1.0; R_1.0
> > >> > > >>     1.1 -> Scala_1.1; R_2.0
> > >> > > >>     1.2 -> Scala_2.0; R_2.1
> > >> > > >>     1.3 -> Scala_2.1; R_2.2
> > >> > > >>     2.0 -> Scala_3.0; R_3.0
> > >> > > >>
> > >> > > >> This would give us the liberty to develop MXNet without
> > >> > > >> restricting us too much - of course, major version increments
> > >> > > >> will still have to be considered carefully. I don't think that
> > >> > > >> this would harm transparency too much and there's no need to
> > >> > > >> write big documentation.
> > >> > > >>
> > >> > > >> -Marco
> > >> > > >>
> > >> > > >>
> > >> > > >> On Sun, Mar 11, 2018 at 12:16 PM, YiZhi Liu <liuyizhi@apache.org>
> > >> > > >> wrote:
> > >> > > >>
> > >> > > >> > I have no idea how separating Scala API version can solve the
> > >> > > >> > 'compatibility' problem. Do you mean we have different version
> > >> > > >> > for 'ml.dmlc' namespace and 'org.apache' namespace? Do these
> > >> > > >> > two versions have same behavior? How to tell which Scala API
> > >> > > >> > version works with which MXNet core version? By document? How
> > >> > > >> > many users will read the whole document and carefully pair the
> > >> > > >> > version id before they run into a strange error and give up?
> > >> > > >> >
> > >> > > >> > Moreover, changing namespace is an issue that is really rare
> > >> > > >> > and hardly happens. For other 'compatibility' problem, for
> > >> > > >> > example, the class/function definitions, should handle the
> > >> > > >> > compatibility itself. You'll never expect a project to have a
> > >> > > >> > different version for changing 'calculate(int)' to
> > >> > > >> > 'calculate(float)', it should just add a new function
> > >> > > >> > 'calculate(float)'.
> > >> > > >> >
> > >> > > >> > Regarding 'In this case the Scala interface is clearly a
> > >> > > >> > separate entity from the C API.'. Everything can be seen as a
> > >> > > >> > separate entity, the mxnet engine, the graph description,
> > >> > > >> > operators, python API, gluon API, etc. We should think
> > >> > > >> > carefully what we want to provide, and what our users need.
> > >> > > >> >
> > >> > > >> > As an example, Apache Spark, still has SparkR (R API), PySpark
> > >> > > >> > (Python API), MLLib, GraphX ... as part of its release, and
> > >> > > >> > have the same version as Spark core as well as its Scala/Java
> > >> > > >> > API.
> > >> > > >> >
> > >> > > >> > 2018-03-10 23:58 GMT-08:00 kellen sunderland
> > >> > > >> > <kellen.sunderland@gmail.com>:
> > >> > > >> > > +1 (non-binding) to what Marco is describing.  +1
> > >> > > >> > > (non-binding) to getting the Scala bindings with the
> > >> > > >> > > namespace change into Maven.
> > >> > > >> > >
> > >> > > >> > > The general best practice for SemVer, which is used by most
> > >> > > >> > > projects that employ SemVer, is to apply SemVer to the public
> > >> > > >> > > APIs of packages that ship with your project.  If you have
> > >> > > >> > > several independent APIs this could mean that they are
> > >> > > >> > > versioned separately from each other, and from the overall
> > >> > > >> > > project versioning mechanism.
> > >> > > >> > >
> > >> > > >> > > For example, the .NET Core library ships with a number of
> > >> > > >> > > binaries, each with their own SemVerioned APIs.  They also
> > >> > > >> > > have a high-level, easy to understand version for the package
> > >> > > >> > > as a whole:
> > >> > > >> > > https://docs.microsoft.com/en-us/dotnet/core/versions/.
> > >> > > >> > >
> > >> > > >> > > Nodesource has a good description of this:
> > >> > > >> > > http://nodesource.com/blog/semver-a-primer/
> > >> > > >> > > “Semver is a scheme for interface versioning for the benefit
> > >> > > >> > > of interface consumers, thus if a tool has multiple
> > >> > > >> > > interfaces, e.g. an API and a CLI, these interfaces may
> > >> > > >> > > evolve independent versioning.”
> > >> > > >> > >
> > >> > > >> > > SemVer at its core is a communication mechanism to inform
> > >> > > >> > > developers of incompatibilities. In this case the Scala
> > >> > > >> > > interface is clearly a separate entity from the C API.  I.e.
> > >> > > >> > > changing the Scala namespace isn’t going to break C API
> > >> > > >> > > users.  It does not communicate anything useful to these
> > >> > > >> > > users if we up their major version in response to a Scala
> > >> > > >> > > change, it simply breaks compatibility.  If we group all
> > >> > > >> > > interfaces together, and increment whenever any of them has a
> > >> > > >> > > breaking change we’ll soon be at MXNet version 587.  We’ll be
> > >> > > >> > > forcing our users to check compatibility and update their
> > >> > > >> > > dependency tracking constantly.  The end result is that our
> > >> > > >> > > users will stop pulling in new versions of the library.
> > >> > > >> > >
> > >> > > >> > > What I would propose is that (1) we have a high-level SemVer
> > >> > > >> > > system that tracks our C_API.  This is the ‘MXNet’ version
> > >> > > >> > > that we generally refer to and emphasize for our public
> > >> > > >> > > releases.  For each API we have an independent versioning
> > >> > > >> > > system that if we can, we fix to the MXNet version.  When it
> > >> > > >> > > makes sense we version these APIs independently.  So for
> > >> > > >> > > example we could have a MXNet 1.2 release that ships with a
> > >> > > >> > > 2.0 Scala API / R API.
> > >> > > >> > >
> > >> > > >> > > In terms of Apache process I think shipping artifacts with a
> > >> > > >> > > non-Apache namespace is a bigger issue than whatever
> > >> > > >> > > versioning conventions we decide to use.
> > >> > > >> > >
> > >> > > >> > > -Kellen
> > >> > > >> > >
> > >> > > >> > > From: Carin Meier
> > >> > > >> > > Sent: Saturday, March 10, 2018 1:41 PM
> > >> > > >> > > To: dev@mxnet.incubator.apache.org
> > >> > > >> > > Cc: dev@mxnet.apache.org
> > >> > > >> > > Subject: Re: Publishing Scala Package/namespace change
> > >> > > >> > >
> > >> > > >> > > +1 as well. I'm actively developing a Clojure package for
> > >> > > >> > > MXNet that uses the jars from the Scala package.
> > >> > > >> > >
> > >> > > >> > > - Carin
> > >> > > >> > >
> > >> > > >> > > On Fri, Mar 9, 2018 at 4:44 PM, YiZhi Liu <eazhi.liu@gmail.com>
> > >> > > >> > > wrote:
> > >> > > >> > >
> > >> > > >> > >> +1 for changing the namespace asap. for the maven deploy, we
> > >> > > >> > >> can have it build along with pip deployment.
> > >> > > >> > >>
> > >> > > >> > >>
> > >> > > >> > >> 2018-03-09 10:15 GMT-08:00 Naveen Swamy <mnnaveen@gmail.com>:
> > >> > > >> > >> > Hi Guys,
> > >> > > >> > >> >
> > >> > > >> > >> > I am working on MXNet Scala Inference APIs
> > >> > > >> > >> > <https://issues.apache.org/jira/browse/MXNET-50> along with
> > >> > > >> > >> > another contributor Roshani. A while back I noticed that we
> > >> > > >> > >> > haven't been publishing the scala package to Maven for a
> > >> > > >> > >> > while now(last one being v0.11.1a under the dmlc namespace).
> > >> > > >> > >> > Currently users have to build the package manually and then
> > >> > > >> > >> > use it, this hinders adoption and also is painful to build
> > >> > > >> > >> > everything from source.
> > >> > > >> source.
> > >> > > >> > >> >
> > >> > > >> > >> > I also see that we haven't changed the namespace to
> > >> > > >> > >> > org.apache and instead are still ml.dmlc namespace.
> > >> > > >> > >> >
> > >> > > >> > >> > I wanted to seek your opinion about changing the MXNet-Scala
> > >> > > >> > >> > package namespace to org.apache for the Scala package and
> > >> > > >> > >> > publish to Maven in the upcoming release. I understand that
> > >> > > >> > >> > this probably breaks the Semver semantics that is agreed
> > >> > > >> > >> > upon, However I would like to point out that the Scala
> > >> > > >> > >> > package has never been published to maven as 1.0 under
> > >> > > >> > >> > org.apache.
> > >> > > >> > >> >
> > >> > > >> > >> > Open to suggestions.
> > >> > > >> > >> >
> > >> > > >> > >> > Thanks, Naveen
> > >> > > >> > >>
> > >> > > >> > >>
> > >> > > >> > >>
> > >> > > >> > >> --
> > >> > > >> > >> Yizhi Liu
> > >> > > >> > >> DMLC member
> > >> > > >> > >> Amazon Web Services
> > >> > > >> > >> Vancouver, Canada
> > >> > > >> > >>
> > >> > > >> > >
> > >> > > >> >
> > >> > > >> >
> > >> > > >> >
> > >> > > >> > --
> > >> > > >> > Yizhi Liu
> > >> > > >> > DMLC member
> > >> > > >> > Amazon Web Services
> > >> > > >> > Vancouver, Canada
> > >> > > >> >
> > >> > > >>
> > >> > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> > >>
> > >>
> > >> --
> > >> Yizhi Liu
> > >> DMLC member
> > >> Amazon Web Services
> > >> Vancouver, Canada
> > >>
> > >
> > >
> >
>

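The per-interface version tables in Marco's quoted message above can be read
as a plain lookup from a binding release to the core version it bundles. The
sketch below encodes his illustrative numbers, which were offered only as an
example and not as a release plan. The names `SCALA_TO_CORE`, `R_TO_CORE`,
`core_for`, and `bindings_for` are hypothetical and do not exist in MXNet.

```python
# Illustrative numbers copied from the example mapping quoted above;
# they are not an actual release plan.
SCALA_TO_CORE = {"1.0": "1.0", "1.1": "1.1", "2.0": "1.2", "2.1": "1.3", "3.0": "2.0"}
R_TO_CORE = {"1.0": "1.0", "2.0": "1.1", "2.1": "1.2", "2.2": "1.3", "3.0": "2.0"}
BINDINGS = {"Scala": SCALA_TO_CORE, "R": R_TO_CORE}


def core_for(binding, version):
    """Return the MXNet core version bundled with a binding release."""
    return BINDINGS[binding][version]


def bindings_for(core_version):
    """Invert the tables: which binding releases ship a given core version."""
    return {
        name: [v for v, core in table.items() if core == core_version]
        for name, table in BINDINGS.items()
    }


if __name__ == "__main__":
    print(core_for("Scala", "2.0"))
    print(bindings_for("1.2"))
```

Whether such a table lives in documentation, in the pom file, or in code is
exactly the maintenance overhead being debated in this thread.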


-- 
Yizhi Liu
DMLC member
Amazon Web Services
Vancouver, Canada
