accumulo-dev mailing list archives

From Christopher <ctubb...@apache.org>
Subject Re: [VOTE] Apache Accumulo 1.6.1 RC1
Date Fri, 26 Sep 2014 01:13:08 GMT
That seems like a reason to vote -1 (and perhaps to encourage others to do
so also). I'm not sure this can be helped so long as people have different
criteria for their vote, though. If we can fix those issues, I'm ready to
vote on a 1.6.2 :)


--
Christopher L Tubbs II
http://gravatar.com/ctubbsii

On Thu, Sep 25, 2014 at 2:42 PM, William Slacum <wilhelm.von.cloud@accumulo.net> wrote:

> I'm a little concerned we had two +1's that mention failures. The one time
> when we're supposed to have a clean run through, we have 50% of the
> participants noticing failures. It doesn't instill much confidence in me.
>
> On Thu, Sep 25, 2014 at 2:18 PM, Josh Elser <josh.elser@gmail.com> wrote:
>
> > Please make a ticket for it and supply the MAC directories for the test
> > and the failsafe output.
> >
> > It doesn't fail for me. It's possible that there is some edge case that
> > you and Bill are hitting that I'm not.
> >
> >
> > Corey Nolet wrote:
> >
> >> I'm seeing the behavior under Mac OS X and Fedora 19 and they have been
> >> consistently failing for me. I'm thinking ACCUMULO-3073. Since others are
> >> able to get it to pass, I did not think it should fail the vote solely on
> >> that, but I do think it needs attention, quickly.
> >>
> >> On Thu, Sep 25, 2014 at 10:43 AM, Bill Havanki <bhavanki@clouderagovt.com> wrote:
> >>
> >>> I haven't had an opportunity to try it again since my +1, but prior to that
> >>> it has been consistently failing.
> >>>
> >>> - I tried extending the timeout on the test, but it would still time out.
> >>> - I see the behavior on Mac OS X and under CentOS. (I wonder if it's a JVM
> >>>   thing?)
> >>>
> >>> On Wed, Sep 24, 2014 at 9:06 PM, Corey Nolet <cjnolet@gmail.com> wrote:
> >>>
> >>>> Vote passes with 4 +1's and no -1's.
> >>>>
> >>>> Bill, were you able to get the IT to run yet? I'm still having timeouts on
> >>>> my end as well.
> >>>>
> >>>>
> >>>> On Wed, Sep 24, 2014 at 1:41 PM, Josh Elser <josh.elser@gmail.com> wrote:
> >>>>
> >>>>> The crux of it is that both of the errors in the CRC were single-bit
> >>>>> "variants".
> >>>>>
> >>>>> y instead of 9 and p instead of 0
> >>>>>
> >>>>> Both of these cases are a '1' in the most significant bit of the byte
> >>>>> instead of a '0'. We recognized these because y and p are outside of the
> >>>>> hex range. Fixing both of these fixes the CRC error (manually verified).
> >>>>>
> >>>>> That's all we know right now. I'm currently running memtest86. I do not
> >>>>> have ECC RAM, so it *is* theoretically possible that was the cause. After
> >>>>> running memtest for a day or so (or until I need my desktop functional
> >>>>> again), I'll go back and see if I can reproduce this again.
> >>>>>
> >>>>>
> >>>>> Mike Drob wrote:
> >>>>>
> >>>>>> Any chance the IRC chats can make it onto the ML for posterity?
> >>>>>>
> >>>>>> Mike
> >>>>>>
> >>>>>> On Wed, Sep 24, 2014 at 12:04 PM, Keith Turner <keith@deenlo.com> wrote:
> >>>>>>
> >>>>>>> On Wed, Sep 24, 2014 at 12:44 PM, Russ Weeks <rweeks@newbrightidea.com> wrote:
> >>>>>>>
> >>>>>>>> Interesting that "y" (0x79) and "9" (0x39) are one bit "away" from each
> >>>>>>>> other. I blame cosmic rays!
> >>>>>>>
> >>>>>>> It is interesting, and that's only half of the story. It's been interesting
> >>>>>>> chatting w/ Josh about this on IRC and hearing about his findings.
> >>>>>>>
> >>>>>>>
> >>>>>>>> On Wed, Sep 24, 2014 at 9:05 AM, Josh Elser <josh.elser@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>> The offending keys are:
> >>>>>>>>>
> >>>>>>>>>> 389a85668b6ebf8e 2ff6:4a78 [] 1411499115242
> >>>>>>>>>> 3a10885b-d481-4d00-be00-0477e231ey65:000000008576b169:
> >>>>>>>>>> 0cd98965c9ccc1d0:ba15529e
> >>>>>>>>>
> >>>>>>>>> The careful eye will notice that the UUID in the first component of the
> >>>>>>>>> value has a different suffix than the next corrupt key/value (ends with
> >>>>>>>>> "ey65" instead of "e965"). Fixing this in the Value and re-running the CRC
> >>>>>>>>> makes it pass.
> >>>>>>>>>
> >>>>>>>>> and
> >>>>>>>>>
> >>>>>>>>>> 7e56b58a0c7df128 5fa0:6249 [] 1411499311578
> >>>>>>>>>> 3a10885b-d481-4d00-be00-0477e231e965:0000p000872d60eb:
> >>>>>>>>>> 499fa72752d82a7c:5c5f19e8
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>
> >>> --
> >>> // Bill Havanki
> >>> // Solutions Architect, Cloudera Govt Solutions
> >>> // 443.686.9283
> >>>
> >>>
> >>
>
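
The single-bit observation quoted in the thread can be checked mechanically.
What follows is a minimal Python sketch, not code from the thread. The function
names (differing_bits, is_single_bit_flip, suspects) are made up for
illustration, and the two value strings are assumed to be the wrapped lines of
each quoted value joined back together. It only inspects ASCII codes; it does
not recompute the CRC, whose details are not given in the thread. In ASCII,
'y' is 0x79 and '9' is 0x39, and 'p' is 0x70 and '0' is 0x30, so each pair
differs only in the 0x40 bit, which is the top bit of the 7-bit ASCII code.

```python
import string

# Characters expected in the values: lowercase hex digits, plus the UUID
# and field separators that appear in the quoted values.
HEX_DIGITS = set(string.hexdigits.lower())
SEPARATORS = {"-", ":"}


def differing_bits(a, b):
    """Return the XOR of the code points of two single characters."""
    return ord(a) ^ ord(b)


def is_single_bit_flip(a, b):
    """True if the two characters differ in exactly one bit."""
    diff = differing_bits(a, b)
    return diff != 0 and (diff & (diff - 1)) == 0


def suspects(value):
    """List (index, char, candidates) for every unexpected character.

    candidates holds the hex digits reachable from char by flipping
    exactly one of its low eight bits.
    """
    found = []
    for index, char in enumerate(value):
        if char in HEX_DIGITS or char in SEPARATORS:
            continue
        flipped = (chr(ord(char) ^ (1 << bit)) for bit in range(8))
        candidates = sorted(c for c in flipped if c in HEX_DIGITS)
        found.append((index, char, candidates))
    return found


# Assumption: each value is the two wrapped lines from the thread, joined.
VALUE_1 = ("3a10885b-d481-4d00-be00-0477e231ey65:000000008576b169:"
           "0cd98965c9ccc1d0:ba15529e")
VALUE_2 = ("3a10885b-d481-4d00-be00-0477e231e965:0000p000872d60eb:"
           "499fa72752d82a7c:5c5f19e8")

if __name__ == "__main__":
    for value in (VALUE_1, VALUE_2):
        for index, char, candidates in suspects(value):
            print(f"index {index}: {char!r} -> candidates {candidates}")
```

Under these assumptions, each value contains exactly one character outside the
hex range, and the only hex digit one bit flip away is the one reported in the
thread ('9' for 'y', '0' for 'p').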
