spark-dev mailing list archives

From Nicholas Chammas <nicholas.cham...@gmail.com>
Subject Re: Design docs: consolidation and discoverability
Date Mon, 27 Apr 2015 17:50:33 GMT
Oh, a GitHub wiki (which is separate from having docs in a repo) is yet
another approach we could take, though if we want to do that on the main
Spark repo we'd need permission from Apache, which may be tough to get...

On Mon, Apr 27, 2015 at 1:47 PM Punyashloka Biswal <punya.biswal@gmail.com>
wrote:

> Nick, I like your idea of keeping it in a separate git repository. It
> seems to combine the advantages of the present Google Docs approach with
> the crisper history, discoverability, and text format simplicity of GitHub
> wikis.
>
> Punya
> On Mon, Apr 27, 2015 at 1:30 PM Nicholas Chammas <
> nicholas.chammas@gmail.com> wrote:
>
>> I like the idea of having design docs be kept up to date and tracked in
>> git.
>>
>> If the Apache repo isn't a good fit, perhaps we can have a separate repo
>> just for design docs? Maybe something like
>> github.com/spark-docs/spark-docs/
>> ?
>>
>> If there's other stuff we want to track but haven't, perhaps we can
>> generalize the purpose of the repo a bit and rename it accordingly (e.g.
>> spark-misc/spark-misc).
>>
>> Nick
>>
>> On Mon, Apr 27, 2015 at 1:21 PM Sandy Ryza <sandy.ryza@cloudera.com>
>> wrote:
>>
>> > My only issue with Google Docs is that they're mutable, so it's difficult
>> > to follow a design's history through its revisions and link up JIRA
>> > comments with the relevant version.
>> >
>> > -Sandy
>> >
>> > On Mon, Apr 27, 2015 at 7:54 AM, Steve Loughran <stevel@hortonworks.com>
>> > wrote:
>> >
>> > >
>> > > One thing to consider is that while docs as PDFs in JIRAs do document the
>> > > original proposal, that's not the place to keep living specifications. That
>> > > stuff needs to live in SCM, in a format which can be easily maintained, can
>> > > generate readable documents, and, in an unrealistically ideal world, even
>> > > be used by machines to validate compliance with the design. Test suites
>> > > tend to be the implicit machine-readable part of the specification, though
>> > > they aren't usually viewed as such.
>> > >
>> > > PDFs of Word docs in JIRAs are not the place for ongoing work, even if the
>> > > early drafts can contain them. Given it's just as easy to point to Markdown
>> > > docs in GitHub by commit ID, that could be an alternative way to publish
>> > > docs, with the document itself being viewed as one of the deliverables.
>> > > When the time comes to update a document, it's there in the source tree
>> > > to edit.
>> > >
>> > > If there's a flaw here, it's that design docs are just that: the design. The
>> > > implementation may not match, and ongoing work will certainly diverge. If the
>> > > design docs aren't kept in sync, then they can mislead people. Accordingly,
>> > > once the design docs are incorporated into the source tree, keeping them in
>> > > sync with changes has to be viewed as being as essential as keeping tests up
>> > > to date.
>> > >
>> > > > On 26 Apr 2015, at 22:34, Patrick Wendell <pwendell@gmail.com> wrote:
>> > > >
>> > > > I actually don't totally see why we can't use Google Docs provided it
>> > > > is clearly discoverable from the JIRA. It was my understanding that
>> > > > many projects do this. Maybe not (?).
>> > > >
>> > > > If it's a matter of maintaining public record on ASF infrastructure,
>> > > > perhaps we can just automate it: when an issue is closed, we capture the
>> > > > doc content and attach it to the JIRA as a PDF.
>> > > >
>> > > > My sense is that in general the ASF infrastructure policy is becoming
>> > > > more and more lenient with regard to using third party services,
>> > > > provided they are broadly accessible (such as a public Google Doc) and
>> > > > can be definitively archived on ASF controlled storage.
>> > > >
>> > > > - Patrick
>> > > >
>> > > > On Fri, Apr 24, 2015 at 4:57 PM, Sean Owen <sowen@cloudera.com> wrote:
>> > > >> I know I recently used Google Docs from a JIRA, so am guilty as
>> > > >> charged. I don't think there are a lot of design docs in general, but
>> > > >> the ones I've seen have simply pushed docs to a JIRA. (I did the same,
>> > > >> mirroring PDFs of the Google Doc.) I don't think this is hard to
>> > > >> follow.
>> > > >>
>> > > >> I think you can do what you like: make a JIRA and attach files. Make a
>> > > >> WIP PR and attach your notes. Make a Google Doc if you're feeling
>> > > >> transgressive.
>> > > >>
>> > > >> I don't see much of a problem to solve here. In practice there are
>> > > >> plenty of workable options, all of which are mainstream, and so I do
>> > > >> not see an argument that somehow this is solved by letting people make
>> > > >> wikis.
>> > > >>
>> > > >> On Fri, Apr 24, 2015 at 7:42 PM, Punyashloka Biswal
>> > > >> <punya.biswal@gmail.com> wrote:
>> > > >>> Okay, I can understand wanting to keep Git history clean, and avoid
>> > > >>> bottlenecking on committers. Is it reasonable to establish a convention of
>> > > >>> having a label, component or (best of all) an issue type for issues that are
>> > > >>> associated with design docs? For example, if we used the existing
>> > > >>> "Brainstorming" issue type, and people put their design doc in the
>> > > >>> description of the ticket, it would be relatively easy to figure out what
>> > > >>> designs are in progress.
>> > > >>>
>> > > >>> Given the push-back against design docs in Git or on the wiki and the strong
>> > > >>> preference for keeping docs on ASF property, I'm a bit surprised that all
>> > > >>> the existing design docs are on Google Docs. Perhaps Apache should consider
>> > > >>> opening up parts of the wiki to a larger group, to better serve this use
>> > > >>> case.
>> > > >>>
>> > > >>> Punya
>> > > >>>
>> > > >>> On Fri, Apr 24, 2015 at 5:01 PM Patrick Wendell <pwendell@gmail.com> wrote:
>> > > >>>>
>> > > >>>> Using our ASF git repository as a working area for design docs
>> > > >>>> seems potentially concerning to me. It's difficult process-wise
>> > > >>>> because all commits need to go through committers, and we'd also
>> > > >>>> pollute our git history a lot with random incremental design updates.
>> > > >>>>
>> > > >>>> The git history is used a lot by downstream packagers, us during our
>> > > >>>> QA process, etc... we really try to keep it oriented around code
>> > > >>>> patches:
>> > > >>>>
>> > > >>>> https://git-wip-us.apache.org/repos/asf?p=spark.git;a=shortlog
>> > > >>>>
>> > > >>>> Committing a polished design doc along with a feature, maybe that's
>> > > >>>> something we could consider. But I still think JIRA is the best
>> > > >>>> location for these docs, consistent with what most other ASF projects
>> > > >>>> do that I know.
>> > > >>>>
>> > > >>>> On Fri, Apr 24, 2015 at 1:19 PM, Cody Koeninger <cody@koeninger.org>
>> > > >>>> wrote:
>> > > >>>>> Why can't pull requests be used for design docs in Git if people who
>> > > >>>>> aren't committers want to contribute changes (as opposed to just
>> > > >>>>> comments)?
>> > > >>>>>
>> > > >>>>> On Fri, Apr 24, 2015 at 2:57 PM, Sean Owen <sowen@cloudera.com> wrote:
>> > > >>>>>
>> > > >>>>>> Only catch there is it requires commit access to the repo. We need a
>> > > >>>>>> way for people who aren't committers to write and collaborate (for
>> > > >>>>>> point #1)
>> > > >>>>>>
>> > > >>>>>> On Fri, Apr 24, 2015 at 3:56 PM, Punyashloka Biswal
>> > > >>>>>> <punya.biswal@gmail.com> wrote:
>> > > >>>>>>> Sandy, doesn't keeping (in-progress) design docs in Git satisfy the
>> > > >>>>>>> history requirement? Referring back to my Gradle example, it seems that
>> > > >>>>>>> https://github.com/gradle/gradle/commits/master/design-docs/build-comparison.md
>> > > >>>>>>> is a really good way to see why the design doc evolved the way it did.
>> > > >>>>>>> When keeping the doc in Jira (presumably as an attachment) it's not easy
>> > > >>>>>>> to see what changed between successive versions of the doc.
>> > > >>>>>>>
>> > > >>>>>>> Punya
>> > > >>>>>>>
>> > > >>>>>>> On Fri, Apr 24, 2015 at 3:53 PM Sandy Ryza <sandy.ryza@cloudera.com> wrote:
>> > > >>>>>>>>
>> > > >>>>>>>> I think there are maybe two separate things we're talking about?
>> > > >>>>>>>>
>> > > >>>>>>>> 1. Design discussions and in-progress design docs.
>> > > >>>>>>>>
>> > > >>>>>>>> My two cents are that JIRA is the best place for this.  It allows
>> > > >>>>>>>> tracking the progression of a design across multiple PRs and
>> > > >>>>>>>> contributors.  A piece of useful feedback that I've gotten in the past
>> > > >>>>>>>> is to make design docs immutable.  When updating them in response to
>> > > >>>>>>>> feedback, post a new version rather than editing the existing one.
>> > > >>>>>>>> This enables tracking the history of a design and makes it possible to
>> > > >>>>>>>> read comments about previous designs in context.  Otherwise it's really
>> > > >>>>>>>> difficult to understand why particular approaches were chosen or
>> > > >>>>>>>> abandoned.
>> > > >>>>>>>>
>> > > >>>>>>>> 2. Completed design docs for features that we've implemented.
>> > > >>>>>>>>
>> > > >>>>>>>> Perhaps less essential to project progress, but it would be really
>> > > >>>>>>>> lovely to have a central repository for all the project's design docs.
>> > > >>>>>>>> If anyone wants to step up to maintain it, it would be cool to have a
>> > > >>>>>>>> wiki page with links to all the final design docs posted on JIRA.
>> > > >>>>>>>>
>> > > >>>>>>
>> > > >
>> > > >
>> > > > ---------------------------------------------------------------------
>> > > > To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>> > > > For additional commands, e-mail: dev-help@spark.apache.org
>> > > >
