On 03/05/11 01:41, Roy T. Fielding wrote:
> I am constantly amazed at how
> quiet it is in this project, at least until I remember that
> most of the work is done exclusively via jira, unlike any of
> my other followed projects that use jira. I'd suggest that
> the right place to hold any discussion is on the dev list,
> but I am not on that list because it receives way too many
> automated notifications. Maybe it would help discussion on
> dev if notices were sent elsewhere and only discussions were
> held on dev.
I've seen this before on the Maven lists, where there's mostly a stream
of JIRA changes above anything else:
http://mail-archives.apache.org/mod_mbox/maven-dev/200510.mbox/browser
However, they've got no JIRA issues in their list now, which may imply
that not all changes are going to the list, or that they aren't using it so much:
http://mail-archives.apache.org/mod_mbox/maven-dev/201104.mbox/browser
(pause: bisecting their list shows that in 1.mar.06 they forked JIRA to
a separate list to hide the details of ongoing work)
In some ways it's a means of dealing with a large and fast moving
codebase: you subscribe to the issues that matter to you, all the
discussions on a specific feature are archived, etc.
However, it has some flaws:
-discouragement of community, you become a group of people working on
JIRA issues, rather than on a large integrated project
-with work spread across common, hdfs and mapreduce JIRAs and mailing
lists, it's hard to keep everything in your head -it is pretty much
a full-time job to do so. And I don't know about the others, but I don't
have the time.
-we need a way of gently moving people from those who use hadoop to
those who develop it. To me, every end user is a warm engineering
resource we just need to point at a problem that they care about. The
scale of the project, its complexity, JIRA change rate and testing
difficulties are all barriers to entry -you end up needing a team of people:
* someone to track all the issues and keep the design in their head
* 1+ person to test
* 1+ person to code
I don't know about others, but I can't do this on my own.
The attempt to split up into HDFS+MAPREDUCE was one tactic to deal with
this, but it hasn't worked; we just have more mailing lists to track (or
in my case, fall behind on).
votewise:
-I'm in favour of shipping an apache release of 20.x that has the patches
that Y! and others have added to deal with scale and availability -and
which has been tested by them. This will provide an apache release for
people to use in production systems -because the official apache
releases have lagged the CDH and Y! releases.
-I'd like to see all the changes integrated into trunk too, as it
doesn't make sense for a patch in this branch not to be in trunk.
Steve