hadoop-common-dev mailing list archives

From Dhruba Borthakur <dhr...@gmail.com>
Subject Re: Hadoop 0.20.0
Date Fri, 27 Feb 2009 18:10:59 GMT
I would like HADOOP-5332 to be part of 0.19, 0.20 and trunk. This
ensures that "append" is switched off by default. At the same time, we would
need patches for HADOOP-4739, HADOOP-4663 and HADOOP-5027. These three are
critical to support "appends".

A series of offline discussions has been summarised in HADOOP-4663. I have
not yet received comments on this summary, but I am already working on it and
will post a patch early next week.

thanks,
dhruba
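
As an illustration of the switch discussed in this thread, here is a minimal sketch of a feature gated by a boolean configuration property that defaults to off. It is not the HADOOP-5332 patch. The property name dfs.support.append comes from the thread, and the exception mirrors the UnsupportedOperationException idea raised earlier in it. The helper names (get_boolean, append, AppendDisabledError) and the sample path are hypothetical, and Python is used only to keep the sketch self-contained.

```python
import xml.etree.ElementTree as ET

# Illustrative site configuration that opts in to append.
# The <configuration>/<property>/<name>/<value> layout follows Hadoop's
# site-file format; the property name is the one named in the thread.
SITE_XML = """
<configuration>
  <property>
    <name>dfs.support.append</name>
    <value>true</value>
  </property>
</configuration>
"""

EMPTY_XML = "<configuration/>"  # no override, so the default applies


class AppendDisabledError(RuntimeError):
    """Hypothetical stand-in for Java's UnsupportedOperationException."""


def get_boolean(conf_xml, key, default=False):
    """Return the boolean value of `key`, or `default` when it is unset."""
    root = ET.fromstring(conf_xml.strip())
    for prop in root.findall("property"):
        if (prop.findtext("name") or "").strip() == key:
            return (prop.findtext("value") or "").strip().lower() == "true"
    return default


def append(conf_xml, path):
    """Hypothetical append entry point gated by dfs.support.append."""
    if not get_boolean(conf_xml, "dfs.support.append", default=False):
        raise AppendDisabledError("append is disabled; set dfs.support.append=true")
    return "append allowed: " + path


if __name__ == "__main__":
    print(append(SITE_XML, "/logs/wal"))
    try:
        append(EMPTY_XML, "/logs/wal")
    except AppendDisabledError as err:
        print("rejected:", err)
```

Defaulting to off means an unmodified deployment never exercises the lightly tested code path, while developers who set the property can keep testing it.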

On Fri, Feb 27, 2009 at 9:20 AM, Doug Judd <doug@zvents.com> wrote:

> I'd like to second that.  I think it would be good to have the database
> elevated to a first class use case for HDFS.  Getting fsync() working
> properly is critical for HBase, Hypertable, or any database built on top of
> HDFS.
>
> - Doug
>
> On Fri, Feb 27, 2009 at 9:08 AM, Jim Kellerman (POWERSET) <
> Jim.Kellerman@microsoft.com> wrote:
>
> > I'd really like to see 4379 in 0.19.2 and 0.20.1 if possible.
> > We are really hurting without it.
> >
> > ---
> > Jim Kellerman, Powerset (Live Search, Microsoft Corporation)
> >
> > > -----Original Message-----
> > > From: Nigel Daley [mailto:ndaley@yahoo-inc.com]
> > > Sent: Thursday, February 26, 2009 9:13 PM
> > > To: core-dev@hadoop.apache.org
> > > Subject: Re: Hadoop 0.20.0
> > >
> > > Thanks Jim.
> > >
> > > Dhruba, can we move
> > > https://issues.apache.org/jira/browse/HADOOP-4379
> > > to 0.21.0?
> > >
> > > Nige
> > >
> > > On Feb 26, 2009, at 10:37 AM, Jim Kellerman (POWERSET) wrote:
> > >
> > > > With the availability of HADOOP-5332 I remove my objection.
> > > >
> > > >> -----Original Message-----
> > > >> From: Dhruba Borthakur [mailto:dhruba@gmail.com]
> > > >> Sent: Wednesday, February 25, 2009 9:32 PM
> > > >> To: core-dev@hadoop.apache.org
> > > >> Subject: Re: Hadoop 0.20.0
> > > >>
> > > > >> I posted a patch for HADOOP-5332. I am suggesting that this patch be
> > > > >> applied to 0.19, 0.20 and trunk. This patch switches off "append" by
> > > > >> default, but it can be switched on by setting the config parameter
> > > > >> dfs.support.append. This does not mean that "append" is bug free in
> > > > >> the code; it just allows developers to continue testing the append
> > > > >> functionality until the bugs are fixed.
> > > >>
> > > >> thanks,
> > > >> dhruba
> > > >>
> > > > >> On Wed, Feb 25, 2009 at 9:05 PM, Hemanth Yamijala
> > > > >> <yhemanth@yahoo-inc.com> wrote:
> > > >>
> > > > >>> +1 for HADOOP-5332. I am in the same position as Brian, as an
> > > > >>> outside observer. This will help us to move on to Hadoop 0.20, which
> > > > >>> has a lot of other features as well that users are asking for.
> > > >>>
> > > >>> Thanks
> > > >>> hemanth
> > > >>>
> > > >>>
> > > >>>
> > > >>>> On Feb 25, 2009, at 10:20 PM, Nigel Daley wrote:
> > > >>>>
> > > >>>>
> > > >>>>> On Feb 25, 2009, at 7:52 PM, Dhruba Borthakur wrote:
> > > >>>>>
> > > >>>>> "Whipping out a patch" says nothing about its reliability.
> > > >>>>>>
> > > > >>>>>> I would like some focus from the developer community to properly
> > > > >>>>>> fix this issue. I am willing to spend as much time as it takes to
> > > > >>>>>> get it fixed the right way, but I would like even more
> > > > >>>>>> constructive engagement from more people to get this one right.
> > > > >>>>>> May I request you to see if you can volunteer to spend some time
> > > > >>>>>> testing some of this code at scale? (I have access to 10 machines
> > > > >>>>>> only for testing.)
> > > >>>>>>
> > > >>>>>
> > > >>>>>
> > > > >>>> Dhruba, can you define "testing some of this code at scale"? Do you
> > > > >>>> simply need access or folks who can run challenging jobs? Scaring up
> > > > >>>> access to the cluster can be easy, but admin / user time isn't
> > > > >>>> really available.
> > > >>>>
> > > > >>>>> Sorry, I can't commit any time/resources to this right now. Perhaps
> > > > >>>>> some hbase folks can. In the meantime, can we make append
> > > > >>>>> configurable in 0.19.2 and 0.20.0? I filed
> > > > >>>>> https://issues.apache.org/jira/browse/HADOOP-5332
> > > >>>>>
> > > >>>>
> > > > >>>> As an outside, irrelevant observer, I think this is a really good
> > > > >>>> compromise. Helps out HBase but also would help prevent rushing.
> > > >>>>
> > > >>>> Brian
> > > >>>>
> > > >>>>
> > > >>>>>
> > > >>>>> Cheers,
> > > >>>>> Nige
> > > >>>>>
> > > >>>>>
> > > >>>>>>
> > > >>>>>> thanks
> > > >>>>>> dhruba
> > > >>>>>>
> > > > >>>>>> On Wed, Feb 25, 2009 at 7:34 PM, Nigel Daley
> > > > >>>>>> <ndaley@yahoo-inc.com> wrote:
> > > >>>>>>
> > > >>>>>>
> > > > >>>>>>> On Feb 24, 2009, at 9:28 PM, Dhruba Borthakur wrote:
> > > >>>>>>>
> > > >>>>>>> Hi Jim,
> > > >>>>>>>
> > > >>>>>>>>
> > > > >>>>>>>> I can understand your problem. I can probably whip out a fix for
> > > > >>>>>>>> HADOOP-4663 and HADOOP-4379 by the end of this week. It would be
> > > > >>>>>>>> nice if somebody else (Hairong, Sanjay, Konstantin?) can
> > > > >>>>>>>> volunteer to discuss and review the patches/fixes.
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>> "Whipping out a patch" doesn't give me any confidence
that this
> > > >> feature
> > > >>>>>>> will be fixed properly. We're building a file
system. Data
> > > >> reliability
> > > >>>>>>> and
> > > >>>>>>> accuracy are absolutely key. We know that this
feature has been
> > > >> very
> > > >>>>>>> lightly tested.
> > > >>>>>>>
> > > > >>>>>>> Nigel: what is the proposed deadline for 0.20?
> > > >>>>>>>
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>> March 6.
> > > >>>>>>>
> > > >>>>>>> Nige
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> thanks,
> > > >>>>>>>
> > > >>>>>>>> dhruba
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>>
> > > > >>>>>>>> On Tue, Feb 24, 2009 at 4:25 PM, Jim Kellerman (POWERSET)
> > > > >>>>>>>> <Jim.Kellerman@microsoft.com> wrote:
> > > >>>>>>>>
> > > >>>>>>>> --1
> > > >>>>>>>>
> > > >>>>>>>>>
> > > > >>>>>>>>> HBase really needs 4379. My testing to date indicates that it
> > > > >>>>>>>>> does work (although I have a bit more testing to do).
> > > > >>>>>>>>>
> > > > >>>>>>>>> I was ok with not putting it into 0.19.1 provided it was in
> > > > >>>>>>>>> 0.19.2 and 0.20.0.
> > > > >>>>>>>>>
> > > > >>>>>>>>> It's a big problem for us now and is hurting our ability to
> > > > >>>>>>>>> keep our community alive. (They will go to Cassandra or
> > > > >>>>>>>>> something else to ensure reliability).
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>> -----Original Message-----
> > > >>>>>>>>>
> > > >>>>>>>>>> From: Nigel Daley [mailto:ndaley@yahoo-inc.com]
> > > > >>>>>>>>>> Sent: Tuesday, February 24, 2009 4:02 PM
> > > >>>>>>>>>> To: core-dev@hadoop.apache.org
> > > >>>>>>>>>> Subject: Hadoop 0.20.0
> > > >>>>>>>>>>
> > > >>>>>>>>>> Folks,
> > > >>>>>>>>>>
> > > > >>>>>>>>>> Hadoop 0.19.1 is now available with the file append feature
> > > > >>>>>>>>>> disabled. It's time to talk about a Hadoop 0.20.0 release.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Hadoop 0.20.0 feature freeze date was almost 3 months ago. The
> > > > >>>>>>>>>> last few blockers are now almost fixed (should be next week)
> > > > >>>>>>>>>> except for HADOOP-4379. HADOOP-4379 is work that is needed to
> > > > >>>>>>>>>> properly implement file append.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> *** I propose we move HADOOP-4379 off to release 0.21.0 and
> > > > >>>>>>>>>> apply the same disabling of file append in Hadoop 0.20.0 that
> > > > >>>>>>>>>> we put in place to get 0.19.1 released (HADOOP-5224 and
> > > > >>>>>>>>>> HADOOP-5225).
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> I will call a vote for 0.20.0 when blockers are fixed.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Cheers,
> > > >>>>>>>>>> Nigel
> > > >>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>> Folks,
> > > >>>>>>>>>>
> > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Some Hadoop deployments have upgraded to 0.19.0. Clearly, the
> > > > >>>>>>>>>>> 0.19 branch has issues and a 0.19.1 release is needed.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Quality issues in the changes made for the file append
> > > > >>>>>>>>>>> feature have prevented some from deploying Hadoop 0.19. One
> > > > >>>>>>>>>>> of these changes (sync) has now been "fixed" by reducing its
> > > > >>>>>>>>>>> semantics in Hadoop 0.18.3 (HADOOP-4997). This was necessary
> > > > >>>>>>>>>>> to stabilize the 0.18 branch.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> I would like to propose that we apply this same "fix" to sync
> > > > >>>>>>>>>>> in 0.19.1 and 0.20.0. Since append requires the full
> > > > >>>>>>>>>>> semantics of sync, I propose we also disable append (perhaps
> > > > >>>>>>>>>>> throw UnsupportedOperationException from API?). Yes, this
> > > > >>>>>>>>>>> would unfortunately be an incompatible change between 0.19.0
> > > > >>>>>>>>>>> and 0.19.1. We can then take the time needed to fix append
> > > > >>>>>>>>>>> properly in 0.21.0.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> I will call a vote for 0.19.1 and 0.20.0 when blockers are
> > > > >>>>>>>>>>> fixed.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Nigel
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>
> > > >>>>
> > > >>>
> > >
> >
> >
>
