hadoop-general mailing list archives

From Jay Booth <jaybo...@gmail.com>
Subject Re: Thinking about the next hadoop mainline release
Date Fri, 17 Jun 2011 14:37:46 GMT
I can look at 1323 (hdfs-918's successor) next week/weekend and clear
the test problems. Thanks, Todd, for updating the patch to current
trunk.  1323 is only filechannel-pooling, which is much less
disruptive than refactoring everything in the DN to be event-driven.

On Fri, Jun 17, 2011 at 10:30 AM, Brian Bockelman <bbockelm@cse.unl.edu> wrote:
> Hi Ryan, Eric,
> Just looked at those two for the first time in a while.
> - HDFS-918 (now 1323?) doesn't seem like it's too controversial, but does seem like there's a bit of validation left.
> - HDFS-347 has a long, contentious history.  However, it seems that most of the strong objections have been cleared up.  Is there anyone left who objects to it, now that it doesn't appear to bypass security?
> Finally, I see Todd has posted HDFS-2080 claiming some sizable performance improvements.  Would it be possible for that to be finished in time for the release?
> As a site which heavily uses random reads and high-throughput reads, I'm very excited for this release!
> Brian
> On Jun 17, 2011, at 2:36 AM, Ryan Rawson wrote:
>> HDFS-918 and HDFS-347 are absolutely critical for random read
>> performance.  The smarter sites are already running HDFS-347 (I guess
>> they aren't running "Hadoop" then?), and soon they will be testing and
>> running HDFS-918 as well.  Opening 1 socket for every read just isn't
>> really scalable.
>> -ryan
>> On Fri, Jun 17, 2011 at 12:17 AM, Eric Baldeschwieler
>> <eric14@yahoo-inc.com> wrote:
>>> Hi Folks,
>>> I'd like to start a conversation on mainline planning and the next release of Apache Hadoop beyond 0.22.
>>> The Yahoo! Hadoop team has been working hard to complete several big Hadoop projects:
>>> - HDFS Federation [HDFS-1052]
>>>  - Already merged into trunk
>>> - Next Generation Map-Reduce [MR-279]
>>>  - Passing most tests now and discussing merging into trunk
>>> - The merging of our previous work on Hadoop with security into mainline [http://yhoo.it/i9Ww8W]
>>>  - This is mostly done, but Owen and others are doing a scrub to close out the remaining issues
>>> All of these projects are now reaching a place where we would like to combine them with the good work already in 0.22 and put out a new Apache release, perhaps 0.23.  We think the best way to accomplish that is to finish the merge in the next few weeks and then cut a release from trunk.
>>> Yahoo stands ready to help us (the Apache Hadoop Community) turn this new release into a stable release by running it through its 9-month test and burn-in process.  The result of that will be another stable release such as 0.18, 0.20 or 0.20.203 (Hadoop with security).  We have Yahoo!'s support for this substantial investment because this new release will have a great combination of new features for small and very large sites alike:
>>>  - New Write Pipeline - HBase support [also in 0.21 & 0.22]
>>>  - Federation - Scale up to larger clusters and the ability to experiment with new namenode approaches
>>>  - Next Gen MapReduce - Scaleup, performance improvements, ability to experiment with new processing frameworks
>>> I think this effort will produce a great new Apache Hadoop release for the community.  I'm starting this thread to collect feedback and hopefully folks' endorsement for merging in MR-279 and putting together this new release.  Feedback please?
>>> Thanks,
>>> E14