reef-dev mailing list archives

From Saikat Kanjilal <>
Subject Re: The plan for reef-1791
Date Fri, 20 Oct 2017 22:40:56 GMT
Hello Folks,
I wanted to give a quick end-of-week status on reef-1791. Here's what I
have working so far:
* Successfully launching the LineCounter program using the DataLoader
architecture against the Spark runtime, for a local file, on a single-node
Hadoop YARN install
* Successfully invoking the flatMap function and having the REEF launcher
run inside it against all the predefined partitions

* Some code cleanup before I submit a PR; no unit tests yet, I will add
them while we're fleshing out the PR
* Documentation around the chosen architecture

The major changes:
* Addition of a new runtime called reef-runtime-spark, which invokes the
SparkContext and launches REEF within that context through a simple flatMap
function for now
* Had to change all the REEF Configuration-related classes
(JavaConfigurationBuilderImpl) to implement the Serializable interface,
because Spark requires every class passed inside a closure to be
serializable. I am wondering about the impact of this (including the
performance impact) on the rest of the REEF codebase
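
A minimal sketch of the serialization constraint mentioned in the last
point, in plain Java 8 with no Spark or REEF dependency. It assumes the
closure is serialized with standard Java serialization, which is how Spark
ships Java closures to executors. The names PlainConfig,
SerializableConfig, SerializableSupplier and canSerialize are hypothetical
stand-ins for illustration; they are not REEF or Spark classes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.function.Supplier;

public class ClosureSerializationSketch {

    // Hypothetical stand-in for a configuration class that is NOT Serializable.
    public static class PlainConfig {
        public final String name;
        public PlainConfig(String name) { this.name = name; }
    }

    // The same class after adding "implements Serializable".
    public static class SerializableConfig implements Serializable {
        private static final long serialVersionUID = 1L;
        public final String name;
        public SerializableConfig(String name) { this.name = name; }
    }

    // A function type that is itself Serializable, so lambdas assigned to it
    // can be written with Java serialization (hypothetical helper type).
    public interface SerializableSupplier<T> extends Supplier<T>, Serializable { }

    // Returns true if Java serialization can write the object, and false if
    // it (or anything it captures) is not Serializable.
    public static boolean canSerialize(Object o) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        PlainConfig plain = new PlainConfig("driver");
        SerializableConfig ser = new SerializableConfig("driver");

        // Each lambda captures one local variable; serializing the lambda
        // also serializes everything it captured.
        SerializableSupplier<String> capturesPlain = () -> plain.name;
        SerializableSupplier<String> capturesSerializable = () -> ser.name;

        System.out.println(canSerialize(capturesPlain));        // false
        System.out.println(canSerialize(capturesSerializable)); // true
    }
}
```

The first closure fails only because of the object it captures, which is
why the captured configuration classes themselves had to change rather
than the function passed to flatMap.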

Please let me know if there are any questions or additional feedback. Look
for the code review, hopefully in the next week or so.
Thanks in advance.

On Tue, Oct 10, 2017 at 9:20 AM, Saikat Kanjilal <> wrote:

> Good morning REEF dev community,
> I wanted to share some thoughts on how I am thinking we move forward with
> the implementation of reef-runtime-spark:
>    1. I have completed my first cut of the code, based on discussions with
>    Sergiy, and am ready to test it, both locally and on either HDInsight
>    or a VM with Spark and Hadoop running on YARN
>    2. Testing will take a bit of time, as we need to work out all the
>    bugs that come up coordinating events with REEF and Spark containers
>    3. Next week I will be testing this on my Mac, running the Spark binaries
>    on Hadoop locally
>    4. Towards the end of the month I will transition to testing on AWS,
>    specifically running Spark on EMR and REEF on that setup. I think running
>    REEF on AWS/EMR is a big plus and will enable more users to run Spark on
>    REEF
>    5. I was going to wait to put out a code review until the first
>    successful tests go through. To reiterate, the goal for the first phase
>    is simply to run HelloReef on Spark
> If you have any concerns or feedback on this plan, do let me know. As I
> mentioned in JIRA, I would really like to see us move to Java 8 sooner
> rather than later; it’ll make the development of reef-runtime-spark a lot
> simpler.
> Thanks in advance for your help.
