hadoop-general mailing list archives

From Eric Yang <ey...@yahoo-inc.com>
Subject [DISCUSSION] development process of Hadoop
Date Thu, 05 May 2011 02:39:51 GMT
Let us reflect on how the Hadoop development community ended up in its current state.
Development is happening rapidly and being tested in all kinds of organizations. However,
Hadoop committers are only committing code that interests their sponsoring companies.
People are coding defensively to ensure that only self-serving patches get committed, while
helping others and resolving merge problems are always given secondary priority. While the
world demands agility, the "review then commit" process is preventing progress. Committers
are afraid to commit patches because review hasn't taken place. By the time a patch is
reviewed, it no longer applies cleanly. People end up having to generate multiple versions
of a patch to ensure the code can be applied. The long lag between patch generation and
review is taking a significant toll on the community and its progress.

Yahoo has a great team of developers who improve Hadoop at a faster pace with their own fork
of the source code. Yahoo was able to deliver features faster because it could use source
code repository tools properly. Unfortunately for Yahoo, its source code repository was not
Apache svn trunk. I applaud Owen and Arun's effort in manually backporting and
forward-porting the changes between Yahoo's GitHub repository and Apache svn. There might
be some JIRAs that need to be merged into the Hadoop 0.20.203 branch to ensure the lineage
is correct. The community should offer to help with a detailed listing of what is missing
rather than vote -1 without concise reasoning about what is missing.

JIRA is meant to be a discussion and collaboration tool, but the Hadoop community tends to
use it as the source code version control system, with people producing diffs by hand. While
I was spending time in the incubator with other projects, the mentors explained that
"review then commit" is not the ASF's philosophy. The Hadoop community should rethink
whether it is using the right tools for the right task.

Use JIRA when there is a large feature set that requires brainstorming, and give developers
the ability to make small incremental changes without RTC. This will ensure developers
help each other rather than police each other.

Any thoughts?

