I would recommend starting an investigation into whether we could support both the 1.x and 2.x lines with a single code base. It seems feasible to refactor the code a bit, compile against 2.0 (or with profiles), and run on either 1.6 or 2.0. For example, by creating a wrapper that implements both Iterable and Iterator, we could overcome the Iterator API change, as shown by our LazyIterableIterator, which did not require any change in related functions. Btw, we did the same for MRv1 and YARN by ensuring that on MRv1 we don't touch YARN-related APIs. Similarly, on Spark we already support both legacy and >=1.6 memory management. I think this kind of platform independence is really valuable, but it obviously adds complexity.
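For anyone unfamiliar with the pattern, here is a minimal sketch of such a dual-interface wrapper (illustrative only; the class and method layout below is my own, not the actual LazyIterableIterator code). The idea is that one object satisfies both the Spark 1.x flat-map contract, which consumes an Iterable, and the Spark 2.x contract, which consumes an Iterator:

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical sketch: a wrapper implementing both Iterable<T> and
// Iterator<T>, so the same object can be handed to call sites compiled
// against the Spark 1.x API (expects Iterable) or 2.x API (expects
// Iterator) without changing the surrounding functions.
public class IterableIterator<T> implements Iterable<T>, Iterator<T> {
    private final Iterator<T> delegate;

    public IterableIterator(Iterator<T> delegate) {
        this.delegate = delegate;
    }

    // Spark 1.x-style consumers treat the object as an Iterable
    @Override
    public Iterator<T> iterator() {
        return delegate;
    }

    // Spark 2.x-style consumers use the object directly as an Iterator
    @Override
    public boolean hasNext() {
        return delegate.hasNext();
    }

    @Override
    public T next() {
        return delegate.next();
    }

    public static void main(String[] args) {
        IterableIterator<Integer> wrap =
            new IterableIterator<>(Arrays.asList(1, 2, 3).iterator());
        int sum = 0;
        for (int v : wrap)  // consumed as an Iterable (1.x style)
            sum += v;
        System.out.println(sum);  // prints 6
    }
}
```

Since the wrapper is lazy (it only delegates), it adds no buffering overhead on either code path.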

Regards,
Matthias



From: Niketan Pansare/Almaden/IBM@IBMUS
To: dev@systemml.incubator.apache.org
Date: 08/03/2016 05:15 PM
Subject: Re: [DISCUSS] Migration to Spark 2.0.0





I am in favor of having one more release against Spark 1.6. Since the default Scala version for Spark 1.6 is 2.10, I recommend either having SystemML compiled and released with a Scala 2.10 profile or having two release candidates.
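As one possible shape for such a profile (the profile id and version values here are illustrative, not taken from the actual SystemML pom.xml), a Maven profile could pin the Scala 2.10 line for a Spark 1.6 release build:

```xml
<!-- Hypothetical Maven profile; id and versions are illustrative only -->
<profile>
  <id>scala-2.10</id>
  <properties>
    <scala.binary.version>2.10</scala.binary.version>
    <scala.version>2.10.5</scala.version>
  </properties>
</profile>
```

A release candidate against Spark 1.6 could then be built with something like `mvn clean package -Pscala-2.10`.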

Thanks,

Niketan Pansare
IBM Almaden Research Center
E-mail: npansar At us.ibm.com

http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar


From: Frederick R Reiss/Almaden/IBM@IBMUS
To: dev@systemml.incubator.apache.org
Date: 08/03/2016 03:58 PM
Subject: Re: [DISCUSS] Migration to Spark 2.0.0




While I agree that getting onto Spark 2.0 quickly ought to be a priority, there are existing early users of SystemML who are likely stuck on Spark 1.6.x for the next few months. Those users may want some of the new experimental features added since 0.10 (specifically frames, the prototype Python DSL, and the new MLContext), and it would be good to have a Spark 1.6 branch of our version tree where we can backport debugged versions of these features if needed.

I would recommend that we do one more SystemML release against Spark 1.6, then switch the head version of SystemML over to Spark 2.0, then immediately perform a second SystemML release. Thoughts?

Fred



From: Deron Eriksson <deroneriksson@gmail.com>
To: dev@systemml.incubator.apache.org
Date: 08/02/2016 12:13 PM
Subject: Re: [DISCUSS] Migration to Spark 2.0.0




I would definitely be in favor of moving to Spark 2.0 as early as possible.
This will allow SystemML to be current with cutting edge Spark. It would be
nice to focus our efforts on the latest Spark.

Deron


On Tue, Aug 2, 2016 at 12:05 PM, <dusenberrymw@gmail.com> wrote:

> I'm in favor of moving to Spark 2.0 now, meaning that our upcoming release
> would include both new features and 2.0 support.  0.10 has plenty of
> functionality for any existing 1.x users.
>
> -Mike
>
> --
>
> Mike Dusenberry
> GitHub: github.com/dusenberrymw
> LinkedIn: linkedin.com/in/mikedusenberry
>
> Sent from my iPhone.
>
>
> > On Aug 2, 2016, at 11:44 AM, Glenn Weidner <gweidner@us.ibm.com> wrote:
> >
> >
> >
> > In the "[DISCUSS] SystemML 0.11 release" thread, native frame support and
> > API updates such as the new MLContext were identified as the main new
> > features for the release.  In addition, support for Spark 2.0.0 was
> > targeted.  Note that the code changes required for Spark 2.0.0 are not
> > backward compatible with earlier Spark versions (e.g., 1.6.2), so I am
> > starting a separate mail thread for anyone to raise objections or
> > alternatives for migrating to Spark 2.0.0.
> >
> > One possible option is to do a release to include the new Apache SystemML
> > features before migrating to Spark 2.0.0.  However, it seems better to
> > have the next Apache SystemML release compatible with the latest Spark
> > version, 2.0.0.  The Apache SystemML 0.10 release from June can be used
> > with earlier versions of Spark.
> >
> > Regards,
> > Glenn
>