systemml-dev mailing list archives

From "Matthias Boehm" <mbo...@us.ibm.com>
Subject Re: Runtime package refactoring
Date Sat, 05 Dec 2015 23:17:06 GMT

Yes, these changes are all local to 'org.apache.sysml.runtime'. Other than
the binary format incompatibility, there are no side effects for MR or
Spark. These changes are primarily a cleanup of a historically grown
package structure and a preparation step. For now, there will still be just
one assembly. Down the road, however, this allows us to create a separate
artifact of the core runtime library (which is already used by all three
runtime backends: CP, MR, and Spark) for external usage too.


Regards,
Matthias



From:	Luciano Resende <luckbr1975@gmail.com>
To:	dev@systemml.incubator.apache.org
Date:	12/05/2015 01:13 PM
Subject:	Re: Runtime package refactoring



On Fri, Dec 4, 2015 at 5:16 PM, Matthias Boehm <mboehm@us.ibm.com> wrote:

>
>
> Hi all,
>
> just a quick heads-up, I'd like to do a refactoring of our runtime package.
> The goals are (1) to separate out all mr-related classes (cleanup), and (2)
> to prepare our core matrix block runtime for packaging as an individual jar
> which would make it consumable as a small-footprint library. I intend to
> make this change mid next week.
>
> Similar to the refactoring from 'com.ibm.bi.dml' to 'org.apache.sysml',
> this change would break binary compatibility with existing datasets in
> binary format because the class names are persistent in the sequence file
> headers. A workaround is to use an old jar to convert your data from the
> old binary format to text, and a new jar to convert the text representation
> to the new binary format.
>
> Here is the proposed package structure:
>
> org.apache.sysml.runtime
> --controlprogram [...]
> --core
> ----matrix
> ----funobj
> ----operators
> --instructions [...]
> --io
> --mapred
> ----data
> ----hadoopfix
> ----jobs
> ----tasks
> ----sort
> --parfor [...]
> --transform
> --util
>

I am assuming these changes are all under org.apache.sysml.runtime


>
> Given this structure we could simply package 'core'/'util' and perhaps 'io'
> into a separate jar.
>
>
A few questions:

- What would be the side effects for the integration of the different
runtimes (MR/Spark)?
- Is this just a local build modularization issue, and are we still
planning to generate ONE distribution assembly?


>
> Regards,
> Matthias
>

Also, as we experienced multiple issues with the previous package
refactoring, I would recommend the following:

- Perform the refactoring on your own fork (not on Apache git)
- Move the files as one git commit
- Do all the file content changes as a second git commit (imports, docs,
javadocs, etc.)
- Create a full build to make sure there are no breakages
- Let the team review to make sure we are not losing history on the files
or something similar.
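
The reason for splitting the move from the content changes is that git does
not record renames; it detects them by content similarity, so a commit that
only moves files is easy to follow in the history. For the second commit, a
minimal sketch of the import rewrite could look like the following. The
package mapping here is purely illustrative (the thread does not give the
exact old-to-new mapping), and the function name is hypothetical.

```python
import re

# Illustrative old -> new package prefixes; the real mapping is an assumption.
PACKAGE_MOVES = {
    "org.apache.sysml.runtime.matrix.data": "org.apache.sysml.runtime.core.matrix",
    "org.apache.sysml.runtime.functionobjects": "org.apache.sysml.runtime.core.funobj",
}

# Matches the qualified name in Java 'package', 'import' and 'import static' lines.
_DECLARATION = re.compile(
    r"^([ \t]*(?:package|import(?:[ \t]+static)?)[ \t]+)([\w.]+)",
    re.MULTILINE,
)


def rewrite_declarations(source, moves=PACKAGE_MOVES):
    """Rewrite package and import declarations according to the given moves."""
    # Longest prefixes first, so nested packages win over their parents.
    ordered = sorted(moves, key=len, reverse=True)

    def replace(match):
        keyword, name = match.group(1), match.group(2)
        for old in ordered:
            if name == old or name.startswith(old + "."):
                return keyword + moves[old] + name[len(old):]
        return match.group(0)  # unrelated declaration: leave untouched

    return _DECLARATION.sub(replace, source)


if __name__ == "__main__":
    java_source = (
        "package org.apache.sysml.runtime.matrix.data;\n"
        "\n"
        "import org.apache.sysml.runtime.functionobjects.Plus;\n"
        "import java.util.List;\n"
    )
    print(rewrite_declarations(java_source))
```

This only touches declarations at the start of a line; fully qualified
names inside code, javadocs, and configuration files would still need a
separate pass, and the full build remains the real check.
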

Thank you

--
Luciano Resende
http://people.apache.org/~lresende
http://twitter.com/lresende1975
http://lresende.blogspot.com/

