lucy-dev mailing list archives

From Marvin Humphrey <>
Subject Re: [lucy-dev] Grant plan
Date Sun, 12 Sep 2010 18:56:02 GMT
On Sat, Sep 11, 2010 at 10:02:35AM -0700, Mattmann, Chris A (388J) wrote:
> > There are also people in the KinoSearch svn logs who are credited for having
> > identified bugs or provided ideas, but who did not supply patches or whose
> > patches were not incorporated into the code base.  I don't think we need to
> > contact these individuals, but we should clarify the status of these commit
> > messages.
> To be honest -- I think that's pushing it and not necessary. We don't need
> to contact every person that's ever posted a JIRA comment or provided an
> idea.

I meant that I would document the commits in question, not that we would
contact the individuals.

For instance, there's a fellow named Edward Betts who's credited 4 times in
the SVN logs.  Edward submitted several test cases which exposed low level
bugs.  However, as hard as he worked on those test cases, they were never
integrated into the code base, because the test cases themselves were all high
level; the final commits typically include either a test I wrote for the
failing lower level component or no test at all.

I don't think we need to contact Edward; if he doesn't participate in the
grant, there are no commits that need to be backed out.   However, since I
didn't clearly differentiate in the commit messages between when a patch was
integrated into the code base (requiring participation in the grant) versus
when someone simply reported a bug, I thought it would be worthwhile to enter
that information into the public record.

I think that in addition to Edward, there are two other people whose names are
in the SVN logs but who never had IP integrated: Andreas Koenig and Marco
Borromeo.  There's also one more in a similar situation, though he's only
credited in the Changes file: Henry Combrinck.  Henry and Edward in particular
have been valuable community members; that doesn't change even if there's no
requirement that they participate in the grant.

Note for fellow Lucy podling committers: it's theoretically possible to avoid
such murkiness going forward by adhering to some best practice recommendations
from the Apache developer documentation:

    You need to make sure that the commit message contains at least the name of
    the contributor and ideally a reference to the Bugzilla or JIRA issue
    where the patch was submitted. The reasons: this preserves the legal trail
    and makes sure that contributors are recognized. Obviously, the latter
    doesn't mean it's not a good idea to list the names of all contributors
    somewhere on the website. To make it easier to "grep" for commits with
    patches from contributors, always use the same pattern in the commit
    message. Traditionally, we use "Submitted by: <name>" or "Obtained from:

    Here's an example of what such a commit message could look like:

    Bugzilla #43835:
    Added some cool new feature.
    Submitted by: John Doe <>

Basically, names should be listed in SVN logs if and only if there is a
copyright interest.  For moral credit, the mailing list archives suffice.

(I thought I'd seen a note about that somewhere in the dev documentation, but
I can't seem to track it down right now.)
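
To illustrate why a consistent pattern matters, here is a minimal sketch of
tallying credited contributors from commit log text. It assumes the
"Submitted by: <name>" convention quoted above; the function name
count_contributors and the sample log are hypothetical, made up for this
example, and real log text would come from something like "svn log".

```python
import re
from collections import Counter

# Matches the "Submitted by: <name>" convention quoted above.
# A trailing "<email>" part, if present, is ignored.
SUBMITTED_BY = re.compile(
    r"^\s*Submitted by:\s*(?P<name>[^<\n]+?)\s*(?:<[^>]*>)?\s*$",
    re.MULTILINE,
)


def count_contributors(log_text):
    """Return a Counter mapping contributor name -> number of credited commits.

    Hypothetical helper: only lines following the exact "Submitted by:"
    pattern are counted, which is the point of always using one pattern.
    """
    return Counter(m.group("name") for m in SUBMITTED_BY.finditer(log_text))


# Made-up sample log: one commit with a patch credit, one without.
sample_log = """\
Bugzilla #43835:
Added some cool new feature.
Submitted by: John Doe <>
------------------------------------------------------------------------
Fixed a crash reported on the mailing list.
"""

if __name__ == "__main__":
    for name, commits in count_contributors(sample_log).items():
        print(name, commits)
```

A commit that merely thanks a bug reporter in free-form prose would not match,
so the tally only picks up people with a copyright interest.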

> > To track all of these issues, I think we should open a JIRA issue entitled
> > "Software Grant Participants".
> +1, done, here [1]. I also cleaned up JIRA a bit, changing the Lucy URL to
> [2], and adding components for documentation, the website and a few other
> things so that we could classify issues better. 

Sounds good, thanks.

> One thing I did was remove "Core -" in front of all of the comps -- I think
> it's pretty clear that they are core.

OK, works for me.

> > Lastly, all references to the current GPL/Artistic licensing need to be
> > excised.  My current thought is to remove the licensing information but leave
> > the copyright notices with my name in place as sort of a "todo" tag.
> You can do an SVN import with the GPL license header in the source code and
> in the license files, it's fine. This should *not* prevent us from doing
> that. We just can't release software with these tags in them at Apache. So,
> before making an Incubation release, the tags need to be removed. My
> recommendation:
> 1. svn drop as is
> 2. create JIRA issue (similar to OODT-3 [3]) to fix license headers/etc.
> 3. work on JIRA issue and resolve it before releasing.

OK, sounds good.  I was unclear based on those enormous threads in
legal-discuss from 2008 and 2009 about the status of Subversion as a
distribution.  After going back to review them one more time, I see that we
are legally OK with GPL'd code from an imported tarball; it's just an ASF
*policy* (not a legal requirement) that everything in SVN be ASL'd... (Sam Ruby)

    Legally, if we follow the terms of the applicable licenses, we can
    distribute artifacts that are not compatible with the Apache License,
    Version 2.0. 

... and the Incubator makes an exception for imports.

> > I am currently reviewing the KinoSearch commit history commit by commit
> > looking for IP issues and assembling an authoritative list of contributors.
> > This is laborious, but it is important work; the audit has yielded one
> > additional name for the software grant participant list (see LUCENE-675,
> > <>).  I think it will take me another week or two to
> > complete my review.  Once that's done, I'll branch and tag the KinoSearch
> > repository and prepare the grant tarball and checksum.
> OK, when you are ready let me know. I think having someone besides you do
> the import is critical (remember: single point of failure?) :) 


I believe that as a matter of formal process, the Incubator PMC has the
binding vote on accepting the software grant.  However, I think we should also
consider holding a lazy-consensus vote of the Lucy PPMC as to whether we
accept the grant.  Nobody else on the PPMC is going to be reviewing the commit
history like I have been, but as a collective we should be making an effort to
follow along with the process.

> I'll throw my name into the hat to svn import it, since I worked with Joe S.
> to get it done on OODT. Any other mentors want to do it, just let me know
> and I'll stand down.

This step is pretty much mechanical, though, right?  It's just verifying the
MD5, unpacking the tarball into an "import" directory, doing a big "svn add"
and committing everything.  In the proposal, we mention that the code source
will be a "snapshot" from the KinoSearch repository -- we're not planning to
import the entire SVN history.
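
For what it's worth, the verify-and-unpack half of that could be sketched
roughly as below. This is only an illustration under assumptions: the function
names, the paths, and the idea of passing the expected MD5 as a hex string are
all hypothetical, and the "svn add" and commit would still be done by hand
afterwards.

```python
import hashlib
import os
import tarfile


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_and_unpack(tarball_path, expected_md5, dest_dir):
    """Unpack the tarball into dest_dir only if its MD5 matches.

    Hypothetical helper. It assumes the tarball comes from a trusted
    source, since extractall() is not safe on untrusted archives.
    Raises ValueError on a checksum mismatch.
    """
    actual = md5_of_file(tarball_path)
    if actual != expected_md5.strip().lower():
        raise ValueError(
            "MD5 mismatch: expected %s, got %s" % (expected_md5, actual)
        )
    os.makedirs(dest_dir, exist_ok=True)
    with tarfile.open(tarball_path) as tar:
        tar.extractall(dest_dir)
    return actual
```

The checksum is compared before anything is written to the "import"
directory, so a corrupted or substituted tarball never reaches the working
copy.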

> > I'm thinking that I should send a private mail to each of the small
> > contributors like the following.
> > 
> >     Greetings [name of valued contributor],
> > 
> > [...]
> My honest assessment of this step: *overkill*. You know who the significant
> contributors to Lucy are, I would imagine by now. Just get them to sign the
> SGA and we're fine.
> I think going through the tedious process of sending these emails over the
> fence, then waiting for them to throw one back over will only continue to
> delay, and the risk of a contributor getting angry over not being part of an
> SGA when her major contribution was a comment on a JIRA issue, or an email
> RE: design sent to an ML, is little to none.
> OTOH, if you feel the person has contributed more than a comment or two, or
> a design email or two, then by all means, include them in the SGA.

OK, then how about we just scratch the email plan and err on the side of
putting people into the grant if they supplied any IP that was integrated into
the project?  We're only talking about 10 people or so anyway, since most
people who made drive-by contributions went on to make bigger ones later.  

If I understand correctly, if some grant participants fail to send in their
SGA forms, we just have to reverse their contributions before we can make a
release -- it doesn't invalidate the whole grant.  However, since that
assertion only appears in the Mentor documentation, I'm not 100% certain of it:

    It may take some time to track down all contributors. It is not necessary
    to have paperwork on file for all contributions before the code is
    imported. It may be necessary to reverse some patches and rewrite areas of
    code if contributors cannot be found or are not happy about giving Apache
    written permission to use their code. 

... and it might conflict with the docs on the IP clearance page:

    It is recommended that the software grant form is modified in order to
    have a line for each party so the completeness of the paperwork can be
    verified upon receipt. 

Hopefully these folks can all be contacted and will get on board anyway so
we won't have to worry about reversing their contributions.

Marvin Humphrey
