spark-dev mailing list archives

From Matei Zaharia <matei.zaha...@gmail.com>
Subject Re: renaming "minor release" to "feature release"
Date Thu, 28 Jul 2016 23:20:03 GMT
I also agree with this given the way we develop stuff. We don't really want to move to possibly-API-breaking
major releases super often, but we do have lots of large features that come out all the time,
and our current name doesn't convey that.

Matei

> On Jul 28, 2016, at 4:15 PM, Reynold Xin <rxin@databricks.com> wrote:
> 
> Yea definitely. Those are consistent with what is defined here: https://cwiki.apache.org/confluence/display/SPARK/Spark+Versioning+Policy
> 
> The only change I'm proposing is replacing "minor" with "feature".
> 
> 
> On Thu, Jul 28, 2016 at 4:10 PM, Sean Owen <sowen@cloudera.com> wrote:
> Although 'minor' is the standard term, the important thing is making
> the nature of the release understood. 'feature release' seems OK to me
> as an additional description.
> 
> Is it worth agreeing on or stating a little more about the theory?
> 
> patch release: backwards/forwards compatible within a minor release,
> generally fixes only
> minor/feature release: backwards compatible within a major release,
> not forward; generally also includes new features
> major release: not backwards compatible and may remove or change
> existing features
> 
> On Thu, Jul 28, 2016 at 3:46 PM, Reynold Xin <rxin@databricks.com> wrote:
> > tl;dr
> >
> > I would like to propose renaming “minor release” to “feature release” in
> > Apache Spark.
> >
> >
> > details
> >
> > Apache Spark’s official versioning policy follows roughly semantic
> > versioning. Each Spark release is versioned as
> > [major].[minor].[maintenance]. That is to say, 1.0.0 and 2.0.0 are both
> > “major releases”, whereas “1.1.0” and “1.3.0” would be minor releases.
> >
> > I have gotten a lot of feedback from users that the word “minor” is
> > confusing and does not accurately describe those releases. When users hear
> > the word “minor”, they think it is a small update that introduces a couple of
> > minor features and some bug fixes. But if you look at the history of Spark
> > 1.x, here is just a subset of the large features added:
> >
> > Spark 1.1: sort-based shuffle, JDBC/ODBC server, new stats library, 2-5X
> > perf improvement for machine learning.
> >
> > Spark 1.2: HA for streaming, new network module, Python API for streaming,
> > ML pipelines, data source API.
> >
> > Spark 1.3: DataFrame API, Spark SQL graduates out of alpha, tons of new
> > algorithms in machine learning.
> >
> > Spark 1.4: SparkR, Python 3 support, DAG viz, robust joins in SQL, math
> > functions, window functions, SQL analytic functions, Python API for
> > pipelines.
> >
> > Spark 1.5: code generation, Project Tungsten
> >
> > Spark 1.6: automatic memory management, Dataset API, ML pipeline persistence
> >
> >
> > So while “minor” is an accurate depiction of the releases from an API
> > compatibility point of view, we are miscommunicating and doing Spark a
> > disservice by calling these releases “minor”. I would actually call these
> > releases “major”, but then it would be a larger deviation from semantic
> > versioning. I think calling these “feature releases” would be a smaller
> > change and a more accurate depiction of what they are.
> >
> > That said, I’m not attached to the name “feature” and am open to
> > suggestions, as long as they don’t convey the notion of “minor”.
> >
> >
> 
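For readers who want the terminology in the thread pinned down, here is a minimal Python sketch of how the [major].[minor].[maintenance] scheme maps onto the release names under discussion. It is only an illustration, not part of Spark: the names Version, parse_version, release_kind and is_backwards_compatible_upgrade are hypothetical, and the compatibility rule encodes the rough theory Sean outlines (backwards compatible within a major release), not an official policy check.

```python
from typing import NamedTuple


class Version(NamedTuple):
    """A release number of the form [major].[minor].[maintenance]."""
    major: int
    minor: int
    maintenance: int


def parse_version(text: str) -> Version:
    """Parse a string such as '1.6.0' into a Version (hypothetical helper)."""
    parts = text.strip().split(".")
    if len(parts) != 3:
        raise ValueError(f"expected three dot-separated numbers, got {text!r}")
    major, minor, maintenance = (int(p) for p in parts)
    return Version(major, minor, maintenance)


def release_kind(previous: str, current: str) -> str:
    """Name the kind of release that takes `previous` to `current`.

    Uses the proposed term 'feature' for what semantic versioning
    calls a 'minor' release.
    """
    old, new = parse_version(previous), parse_version(current)
    if new <= old:  # tuples compare field by field, left to right
        raise ValueError("current must be newer than previous")
    if new.major != old.major:
        return "major"
    if new.minor != old.minor:
        return "feature"
    return "maintenance"


def is_backwards_compatible_upgrade(previous: str, current: str) -> bool:
    """Assumption from the thread: only major releases may break existing APIs."""
    return release_kind(previous, current) != "major"


if __name__ == "__main__":
    for old, new in [("1.6.0", "1.6.1"), ("1.5.2", "1.6.0"), ("1.6.0", "2.0.0")]:
        print(old, "->", new, ":", release_kind(old, new), "release")
```

Under these assumptions, moving from 1.5.2 to 1.6.0 is reported as a "feature" release and is treated as backwards compatible, while 1.6.0 to 2.0.0 is a "major" release and is not.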

