hbase-dev mailing list archives

From Jonathan Hsieh <...@cloudera.com>
Subject Re: ANNOUNCE: hbase 0.96.1.1rc0 release candidate is available for download.
Date Thu, 19 Dec 2013 17:21:02 GMT
Ted,

My question really boils down to this -- can we have data loss if we
don't take in HBASE-10142 or not?  It will take me a little more time to
reverse engineer it and convince myself one way or the other.  If we can lose
data, then I'll take the port and roll another rc.  If we can't (we are
just less performant), then since we have 4 binding +1's other than mine I'll
release.

In either case, I want to understand how it fixes the problem. I can see the
changes, e.g. the re-entrant lock and the shifting out of the
checkLowReplication call, but I need to figure out why the changes were
necessary. The main substantive change is that the lock is taken around the
checkLowReplication call, but I haven't put together why this fixes the
problem.  Can you shed more light, and ideally lay out in the jira why the
changes fix the problem? (Maybe a bad multithreaded trace that is fixed by the
new lock?)
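
For readers following the thread, here is a minimal sketch of the general
pattern being asked about: a check-then-act on shared state guarded by a
re-entrant lock, so that a caller already holding the lock can invoke the
check without deadlocking. It is not the HBase code or the HBASE-10142 patch.
The class LowReplicationSketch, its fields, the method syncAndCheck, and the
thresholds are all hypothetical names chosen for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration only; not the HBase log code or the HBASE-10142 patch.
class LowReplicationSketch {
    private final ReentrantLock rollLock = new ReentrantLock();
    private final int minReplication;     // assumed minimum tolerable replication
    private final int rollLimit;          // assumed cap on consecutive low-replication rolls
    private int consecutiveLowRolls = 0;  // guarded by rollLock
    private boolean rollerEnabled = true; // guarded by rollLock

    LowReplicationSketch(int minReplication, int rollLimit) {
        this.minReplication = minReplication;
        this.rollLimit = rollLimit;
    }

    /** Returns true if the caller should request a log roll. */
    boolean checkLowReplication(int currentReplication) {
        rollLock.lock();
        try {
            if (currentReplication >= minReplication) {
                // Replication recovered: reset the counter and re-enable the roller.
                consecutiveLowRolls = 0;
                rollerEnabled = true;
                return false;
            }
            if (!rollerEnabled) {
                return false;
            }
            // The counter update and the disable decision happen atomically,
            // so concurrent callers cannot overshoot the limit.
            consecutiveLowRolls++;
            if (consecutiveLowRolls >= rollLimit) {
                rollerEnabled = false;
            }
            return true;
        } finally {
            rollLock.unlock();
        }
    }

    /** Stands in for a write path that already holds the lock when it runs the check. */
    boolean syncAndCheck(int currentReplication) {
        rollLock.lock();
        try {
            // Re-entrancy: checkLowReplication locks rollLock again on this thread.
            return checkLowReplication(currentReplication);
        } finally {
            rollLock.unlock();
        }
    }

    boolean isRollerEnabled() {
        rollLock.lock();
        try {
            return rollerEnabled;
        } finally {
            rollLock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        LowReplicationSketch log = new LowReplicationSketch(2, 5);
        AtomicInteger rollRequests = new AtomicInteger();
        Thread[] writers = new Thread[8];
        for (int i = 0; i < writers.length; i++) {
            writers[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) {
                    if (log.syncAndCheck(1)) {
                        rollRequests.incrementAndGet();
                    }
                }
            });
            writers[i].start();
        }
        for (Thread t : writers) {
            t.join();
        }
        System.out.println("roll requests=" + rollRequests.get()
            + ", roller enabled=" + log.isRollerEnabled());
    }
}
```

In this sketch, removing the lock would let two writer threads interleave the
increment and the disable decision, which is one way a "roller should have
been disabled" style assertion could fail intermittently under a full test
run. Whether that is the actual failure mode in HBASE-10142 is exactly the
open question above.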

Thanks,
Jon.


On Thu, Dec 19, 2013 at 9:01 AM, Ted Yu <yuzhihong@gmail.com> wrote:

> Jon:
> If Stack gives the greenlight, I can certainly port it to 0.96 branch.
>
> Cheers
>
>
> On Thu, Dec 19, 2013 at 8:39 AM, Jonathan Hsieh <jon@cloudera.com> wrote:
>
> > When I ran the test standalone, I didn't have a failure in 0.96.1.1rc0
> > or 0.96.1. When I ran the whole suite, I ran into exactly the same failure
> > on 0.96.1.1 (currently testing the full suite from the 0.96.1 src tarball).
> >
> > I've spent some time reviewing HBASE-10142. There are some non-test code
> > modifications; I'm still trying to determine whether it is a serious problem
> > on that side.  Ted, is there a reason why this wasn't ported to the 0.96
> > branch?
> >
> > Jon.
> >
> >
> > On Wed, Dec 18, 2013 at 9:00 PM, Jean-Marc Spaggiari <
> > jean-marc@spaggiari.org> wrote:
> >
> > > The typo is because I had not done a cut&paste ;)
> > >
> > > With cut&paste: mvn test -PrunAllTests -Dsurefire.secondPartThreadCount=8
> > >
> > > On the last run, error is:
> > > Failed tests:
> > >
> > > testLogRollOnDatanodeDeath(org.apache.hadoop.hbase.regionserver.wal.TestLogRolling):
> > > LowReplication Roller should've been disabled, current replication=1
> > >
> > > I did not keep the errors from the previous runs...
> > >
> > >
> > > 2013/12/18 Ted Yu <yuzhihong@gmail.com>
> > >
> > > > What error(s) did you see ?
> > > >
> > > > There was a typo in this def (missing 's' between D and u); it should be:
> > > > -Dsurefire.secondPartThreadCount
> > > >
> > > > You can lower the thread count: in trunk build, value of 2 is used.
> > > >
> > > > Cheers
> > > >
> > > >
> > > > On Wed, Dec 18, 2013 at 8:45 PM, Jean-Marc Spaggiari <
> > > > jean-marc@spaggiari.org> wrote:
> > > >
> > > > > I tried multiple times over many hours to run:
> > > > > mvn test -PrunallTests -Dusefire.secondPartThreadCount=8
> > > > >
> > > > > > On a local machine using the src jar, with no success. I might be missing
> > > > > > something... I will investigate so I will be able to provide better
> > > > > > feedback for 0.96.2...
> > > > >
> > > > > Sorry about that.
> > > > >
> > > > >
> > > > > 2013/12/18 Enis Söztutar <enis.soz@gmail.com>
> > > > >
> > > > > > +1.
> > > > > >
> > > > > > - downloaded the artifacts
> > > > > > - checked checksums
> > > > > > - checked sigs
> > > > > > - checked hadoop libs in h1 / h2
> > > > > > - checked directory layouts
> > > > > > - run local cluster
> > > > > > - run smoke tests with shell on the artifacts
> > > > > > - run tests locally:
> > > > > > >   -- bin/hbase org.apache.hadoop.hbase.util.LoadTestTool -write 10:10:100 -num_keys 1000000 -read 100:30
> > > > > > >   -- bin/hbase "org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList Loop 1 1 3000000 /tmp/biglinkedlist 1"
> > > > > >
> > > > > >
> > > > > > > On Wed, Dec 18, 2013 at 12:45 PM, Stack <stack@duboce.net> wrote:
> > > > > >
> > > > > > > +1
> > > > > > >
> > > > > > > > Downloaded, unbundled, checked layout, and ran it.  Browsed the mvn
> > > > > > > > artifacts.
> > > > > > >
> > > > > > > St.Ack
> > > > > > >
> > > > > > >
> > > > > > > > On Tue, Dec 17, 2013 at 2:23 PM, Jonathan Hsieh <jon@cloudera.com> wrote:
> > > > > > >
> > > > > > > > > This is a quick-fix release directly off of the 0.96.1 release.  It can be
> > > > > > > > > downloaded here:
> > > > > > > >
> > > > > > > > http://people.apache.org/~jmhsieh/hbase-0.96.1.1rc0/
> > > > > > > >
> > > > > > > > The maven staging repo is here:
> > > > > > > >
> > > > > > > >
> > > > >
> > https://repository.apache.org/content/repositories/orgapachehbase-056/
> > > > > > > >
> > > > > > > > > There is only one jira'ed patch in this release, HBASE-10188 [1].  This
> > > > > > > > > fixes an API incompatibility introduced between 0.96.0 and 0.96.1.  Other
> > > > > > > > > changes include updates to CHANGES.txt (to include that jira), and pom.xml
> > > > > > > > > (naming the release 0.96.1.1).
> > > > > > > >
> > > > > > > > > As such, we'll have an abridged testing and voting period.  Please have a
> > > > > > > > > quick look and vote +1/-1 by 12/18/13 23:59 pacific time. If this passes
> > > > > > > > > we'll take down 0.96.1.
> > > > > > > >
> > > > > > > > > Please check the release mechanics -- this is my first attempt at an hbase
> > > > > > > > > release.
> > > > > > > >
> > > > > > > > Thanks!
> > > > > > > >
> > > > > > > > [1] https://issues.apache.org/jira/browse/HBASE-10188
> > > > > > > >
> > > > > > > > --
> > > > > > > > // Jonathan Hsieh (shay)
> > > > > > > > // Software Engineer, Cloudera
> > > > > > > > // jon@cloudera.com
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> >
> >
> > --
> > // Jonathan Hsieh (shay)
> > // Software Engineer, Cloudera
> > // jon@cloudera.com
> >
>



-- 
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// jon@cloudera.com
