spark-dev mailing list archives

From Matei Zaharia
Subject Re: Spark Improvement Proposals
Date Fri, 07 Oct 2016 17:50:53 GMT
For the improvement proposals, I think one major point was to make them really visible to users
who are not contributors, so we should do more than sending stuff to dev@. One very lightweight
idea is to have a new type of JIRA called a SIP and have a link to a filter that shows all
such JIRAs from [...]. I also like the idea of SIP and design doc templates
(in fact many projects have them).


> On Oct 7, 2016, at 10:38 AM, Reynold Xin wrote:
> I called Cody last night and talked about some of the topics in his email. It became
> clear to me Cody genuinely cares about the project.
> Some of the frustrations come from the project's own success: it has become very "hot",
> and it is difficult to get clarity from people who don't dedicate all their time to Spark.
> In fact, it is in some ways similar to scaling an engineering team in a successful startup:
> old processes that worked well might not work so well once the team reaches a certain size,
> cultures can get diluted, building culture competes with building process, etc.
> I would also really like to have a more visible process for larger changes, especially major
> user-facing API changes. Historically we have uploaded design docs for major changes, but this
> is not always done consistently, and it is difficult to ensure the quality of the docs, due to
> the volunteer nature of the organization.
> Some of the more concrete ideas we discussed focus on building a culture to improve clarity:
> - Process: Large changes should have design docs posted on JIRA. One thing Cody and I
> didn't discuss, but an idea that just came to me, is that we should create a design doc
> template for the project and ask everybody to follow it. The design doc template should also
> explicitly list goals and non-goals, to make design docs more consistent.
> - Process: Email dev@ to solicit feedback. We have done this with some changes, but again
> very inconsistently. Just posting something on JIRA isn't sufficient, because there are simply
> too many JIRAs and the signal gets lost in the noise. While this is generally impossible to
> enforce, because we can't force all volunteers to conform to a process (or they might not even
> be aware of it), those who are more familiar with the project can help by emailing
> dev@ when they see something that hasn't been sent there.
> - Culture: The design doc author(s) should be open to feedback. A design doc should serve
> as the base for discussion and is by no means the final design. Of course, this does not mean
> the author has to accept every piece of feedback. They should also be comfortable accepting or
> rejecting ideas on technical grounds.
> - Process / Culture: For major ongoing projects, it can be useful to have monthly
> Google Hangouts that are open to the world. I am actually not sure how well this will work,
> because of the volunteer nature of the project and the need to adjust for time zones for
> people across the globe, but it seems worth trying.
> - Culture: Contributors (including committers) should be more direct in setting expectations,
> including whether they are working on a specific issue, whether they will be working on a
> specific issue, and whether an issue, PR, or JIRA should be rejected. Most people I know
> in this community are nice and don't enjoy telling other people no, but it is often more
> annoying to a contributor to not know anything than to get a no.
> On Fri, Oct 7, 2016 at 10:03 AM, Matei Zaharia wrote:
> Love the idea of a more visible "Spark Improvement Proposal" process that solicits user
> input on new APIs. For what it's worth, I don't think committers are trying to minimize their
> own work -- every committer cares about making the software useful for users. However, it
> is always hard to get user input and so it helps to have this kind of process. I've certainly
> looked at the *IPs a lot in other software I use just to see the biggest things on the roadmap.
> When you're talking about "changing interfaces", are you talking about public or internal
> APIs? I do think many people hate changing public APIs and I actually think that's for the
> best of the project. That's a technical debate, but basically, the worst thing when you're
> using a piece of software is that the developers constantly ask you to rewrite your app to
> update to a new version (and thus benefit from bug fixes, etc.). Cue anyone who's used Protobuf
> or Guava. The "let's get everyone to change their code this release" model works well within
> a single large company, but doesn't work well for a community, which is why nearly all *very*
> widely used programming interfaces (I'm talking things like the Java standard library, the
> Windows API, etc.) almost *never* break backwards compatibility. All this is done within reason
> though, e.g. we do change things in major releases (2.x, 3.x, etc.).
