Mailing-List: contact dev-help@spark.incubator.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@spark.incubator.apache.org
Received-SPF: pass (athena.apache.org: domain of matthew.c.cheah@gmail.com
 designates 209.85.223.177 as permitted sender)
MIME-Version: 1.0
Date: Tue, 17 Dec 2013 11:30:45 -0800
Message-ID: 
 <CAHH8_OPG0jT9_1mPZQrwdhY4gfiQ7-gpwZFVHqz8hiUyfJ=3uA@mail.gmail.com>
Subject: Spark development for undergraduate project
From: Matthew Cheah <matthew.c.cheah@gmail.com>
To: dev@spark.incubator.apache.org
Content-Type: multipart/alternative; boundary=089e0111bb740b160e04edbff71d

--089e0111bb740b160e04edbff71d
Content-Type: text/plain; charset=ISO-8859-1

Hi everyone,

During my most recent internship, I worked extensively with Apache Spark,
integrating it into a company's data analytics platform. I've now become
interested in contributing to Apache Spark.

I'm returning to undergraduate studies in January and there is an academic
course which is simply a standalone software engineering project. I was
thinking that some contribution to Apache Spark would satisfy my curiosity,
help me continue to support the company I interned at, and give me the academic
credits I need to graduate, all at the same time. It seems like too good
an opportunity to pass up.

With that in mind, I have the following questions:

   1. At this point, is there any self-contained project that I could work
   on within Spark? Ideally, I would work on it independently, over about a
   three-month time frame. This time also needs to accommodate ramping up on
   the Spark codebase and adjusting to the Scala programming language and
   paradigms. The company I worked at primarily used the Java APIs. The output
   needs to be a technical report describing the project requirements and the
   design process I followed to engineer a solution that meets them. In
   particular, it cannot just be a series of haphazard patches.
   2. How can I get started with contributing to Spark?
   3. Is there a high-level UML diagram or some other design specification for
   the Spark architecture?

Thanks! I hope to be of some help =)

-Matt Cheah

--089e0111bb740b160e04edbff71d--