accumulo-dev mailing list archives

From Josh Elser <josh.el...@gmail.com>
Subject Re: Potential Performance Regressions
Date Wed, 19 Feb 2014 21:39:41 GMT
TYVM for lifting this into its own thread.

Trying to identify whether the changes you see are related to the entire 
1.5 series or if it's unique to 1.5.1 will be very important.

I'm also surprised you only saw a 10% slowdown with a replication of 3
instead of 2. I remember the last time this came up (what Keith referred
to), the math came out to about a 30% slowdown with a replication of 3.
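
For reference, a rough back-of-envelope sketch of the numbers being compared. This is only an illustration under an assumed model (ingest bound purely by WAL bytes written), not a measurement from anyone's cluster, and the function names are made up for this example:

```python
def wal_bytes_ratio(old_replication: int, new_replication: int) -> float:
    """How many times more WAL bytes HDFS stores per ingested byte."""
    return new_replication / old_replication


def naive_throughput_drop(old_replication: int, new_replication: int) -> float:
    """Fractional throughput loss IF ingest were bound purely by WAL bytes written.

    Assumed worst-case model for illustration only; real behavior depends on
    the HDFS write pipeline, sync calls, network, and disks.
    """
    return 1.0 - old_replication / new_replication


if __name__ == "__main__":
    # Going from 2 replicas to 3 writes 1.5x the bytes ...
    print(f"bytes written: {wal_bytes_ratio(2, 3):.2f}x")
    # ... which under the naive model is a ~33% throughput drop.
    print(f"naive throughput drop: {naive_throughput_drop(2, 3):.0%}")
```

Under that assumption the expected drop is about a third, which is in the neighborhood of the 30% figure; a measured 10% would suggest the WAL write volume is not the only bottleneck.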

On 2/19/14, 1:05 PM, Mike Drob wrote:
> Subject changed to not clutter the RC thread.
>
> I've been doing testing with HDFS WAL replication of 2 and the default (3)
> and it has not made a huge difference. Probably about 10%. As easy as it is
> to point at that and say that because we are writing 1.5x the data then we
> must have 1.5x the slowdown, I don't think that is the case here.
>
> We'll run a couple more rounds of CI with different settings and report
> back.
>
>
> On Wed, Feb 19, 2014 at 2:39 PM, Keith Turner <keith@deenlo.com> wrote:
>
>> On Wed, Feb 19, 2014 at 2:21 PM, Sean Busbey <busbey+lists@cloudera.com> wrote:
>>
>>> -1 (if we're still counting votes) due to #3 below
>>>
>>> Here's where I am ATM:
>>>
>>> * Verified data integrity for data written in 1.4.4 after upgrade (for a
>>> smattering of rfile options, built from source dist with hadoop 2 profile)
>>> * 2x Continuous Ingest 24hr w/verification[1] (built from source dist with
>>> hadoop 2 profile)
>>>      * once for each of RC1 and RC2
>>>      * no agitation due to ACCUMULO-2382 discovered after the fact (NN
>>> failover still present)
>>>      * Significantly fewer cells written than when I last ran on the same
>>> cluster with 1.4.5-6593a9+agitation (7B vs 31B)
>>> * functional tests of binary distro pass on Hadoop 2, given workarounds[2]
>>>
>>> 1) None of the issues I ran into running tests look like blockers; they've
>>> all been filed at this point.
>>>
>>> 2) The significant decrease in write throughput might be concerning, but I
>>> don't know if this was already in 1.5.0 so I'm not flagging it.
>>>
>>
>> There is a difference in replication between 1.4 and 1.5. 1.4 used to
>> replicate data to two loggers. 1.5, using HDFS defaults, will replicate
>> walogs to 3 datanodes. ACCUMULO-1083 [1] has some discussion and numbers
>> about this. There is also ACCUMULO-1905 [2]: 1.4 would not call hsync.
>>
>> [1] : https://issues.apache.org/jira/browse/ACCUMULO-1083
>> [2] : https://issues.apache.org/jira/browse/ACCUMULO-1905
>>
>>
>>>
>>> 3) the release notes need to have things broken out by version. Otherwise
>>> you're asking an ops person to go back and look at the 1.5.0 release notes
>>> to determine how 1.5.1 impacts them. For comparison, both Avro and Jackson
>>> (which I consider good exemplars for projects) break out their release
>>> notes to the bugfix[3].
>>>
>>> 4) I'm a little concerned that no one has done a Hadoop 1 test yet.
>>>
>>> [1]: Cluster Specs
>>> OS: CentOS 6.4
>>> Hadoop: CDH 4.5.0 (2.0.0+cdh4.5.0)
>>> ZK: CDH 4.5.0 (3.4.5+cdh4.5.0)
>>> Size: 2 Masters, 5 Workers, HDFS in HA+QJM, 5 ZKs
>>>
>>> [2]: Run on single node (backed by the same cluster Bill mentioned earlier)
>>> OS: CentOS 6.4
>>> Hadoop: CDH 4.5.0 (2.0.0+cdh4.5.0)
>>> ZK: CDH 4.5.0 (3.4.5+cdh4.5.0)
>>> Size: 2 Masters, 5 Workers, HDFS in HA+QJM, 3 ZKs
>>>
>>> [3]: e.g.
>>> http://svn.codehaus.org/jackson/tags/1.8/1.8.9/release-notes/VERSION
>>> https://github.com/apache/avro/blob/trunk/CHANGES.txt
>>>
>>>
>>> On Wed, Feb 19, 2014 at 12:23 PM, Mike Drob <madrob@cloudera.com> wrote:
>>>
>>>> I went back and looked at our release governance page[1] and it does
>>>> explicitly state that votes will be 72 hours. So I was out of line when
>>>> asking you to extend it and I'm not sure that the extension is valid at
>>>> this point anyway. Lack of bylaws makes this a messy process.
>>>>
>>>> In light of this I am changing my vote from +1 to +0, since I did not vote
>>>> in the original time frame.
>>>>
>>>> [1]: http://accumulo.apache.org/governance/releasing.html#releasing
>>>>
>>>>
>>>> On Sat, Feb 15, 2014 at 7:17 PM, Josh Elser <josh.elser@gmail.com> wrote:
>>>>
>>>>> Alright, given the snow, holiday, and the lack of bylaws stating that I
>>>>> cannot do this:
>>>>>
>>>>> I'm extending the VOTE on 1.5.1-RC2 until 02/19/2014 1900 EST (this
>>>>> extends the original duration to a week for those keeping track). This is
>>>>> expected to provide an additional two full work days for people to inspect
>>>>> the release.
>>>>>
>>>>> Let's get some good feedback before then, folks.
>>>>>
>>>>> - Josh
>>>>>
>>>>>
>>>>> On 2/15/14, 6:29 PM, Christopher wrote:
>>>>>
>>>>>> Either way works for me.
>>>>>>
>>>>>> I was just suggesting a more formal approach in the absence of bylaws
>>>>>> that explicitly permit extensions. The general concern, I suppose, is
>>>>>> that vote extensions could be used to manipulate to a desired outcome
>>>>>> in a majority approval scheme... so having the vote conditions fixed
>>>>>> at the time it is announced prevents that. I don't think that's a
>>>>>> serious concern, though... especially since we all have the same goal
>>>>>> of producing a quality release, and preventing one that falls short of
>>>>>> that.
>>>>>>
>>>>>> With the bylaws in place, things are simpler, because we'd have
>>>>>> already agreed on those bylaws, and wouldn't need to do anything
>>>>>> silly, like vote on whether to allow a vote extension in the first
>>>>>> place (which would get obnoxious).
>>>>>>
>>>>>> --
>>>>>> Christopher L Tubbs II
>>>>>> http://gravatar.com/ctubbsii
>>>>>>
>>>>>>
>>>>>> On Sat, Feb 15, 2014 at 6:00 PM, Billie Rinaldi
>>>>>> <billie.rinaldi@gmail.com> wrote:
>>>>>>
>>>>>>> On Sat, Feb 15, 2014 at 1:57 PM, Christopher <ctubbsii@apache.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> A somewhat more formal way of "extending" the vote would be to simply
>>>>>>>> retract/cancel this vote (or let it lapse with no votes), and just
>>>>>>>> re-issue another vote with identical artifacts at a more opportune
>>>>>>>> time. I point this out for two reasons:
>>>>>>>>
>>>>>>>> 1) I don't want to undermine Josh's work to create this release
>>>>>>>> candidate. He shouldn't have to do that again if nothing has changed
>>>>>>>> and we just need more time to review. And,
>>>>>>>>
>>>>>>>> 2) The vote was called with a 72hr. notice, and changing that after
>>>>>>>> the fact is probably a bit questionable. We can achieve the same
>>>>>>>> effect without modifying the characteristics of the vote, by simply
>>>>>>>> calling a new vote, identical to this one, later.
>>>>>>>>
>>>>>>>>
>>>>>>> I'm not sure that extending the vote is questionable.  I think it would
>>>>>>> be fine if Josh just said the vote deadline is extended to X (perhaps an
>>>>>>> additional 72 hours, or maybe even one week from the original post since
>>>>>>> many people have Monday off).  Some Apache projects explicitly mention
>>>>>>> that votes may be extended in their bylaws [1], so that's something we
>>>>>>> could consider when we write ours.
>>>>>>>
>>>>>>> But if people would feel more comfortable if Josh reposted the vote, I'm
>>>>>>> sure he could do that.  :-)
>>>>>>>
>>>>>>> [1]: https://hc.apache.org/bylaws.html
>>>>>>>
>>>>>>> --
>>>>>>> Christopher L Tubbs II
>>>>>>> http://gravatar.com/ctubbsii
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Feb 14, 2014 at 6:09 PM, Christopher <ctubbsii@apache.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>> More time would be great. I'll still try to finish up some testing by
>>>>>>>>> tomorrow, but I can't make any guarantees.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Christopher L Tubbs II
>>>>>>>>> http://gravatar.com/ctubbsii
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, Feb 14, 2014 at 12:43 PM, Josh Elser <josh.elser@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> If people want some extra time given the impact of snow, please inform.
>>>>>>>>>> I'm ok with extending this a few days if it means people will give it
>>>>>>>>>> more love.
>>>>>>>>>>
>>>>>>>>>> On 2/12/14, 6:50 PM, Josh Elser wrote:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> All,
>>>>>>>>>>>
>>>>>>>>>>> Please consider the following candidate as Apache Accumulo 1.5.1
>>>>>>>>>>>
>>>>>>>>>>> Git artifacts: The staging repository was built from the branch
>>>>>>>>>>> "1.5.1-rc2" (c810f51b). No accompanying git tag was created yet (as it
>>>>>>>>>>> would be the same exact thing as providing the above SHA1).
>>>>>>>>>>>
>>>>>>>>>>> Maven Staged Repo:
>>>>>>>>>>> https://repository.apache.org/content/repositories/orgapacheaccumulo-1001
>>>>>>>>>>>
>>>>>>>>>>> Source tarball:
>>>>>>>>>>> http://repository.apache.org/content/repositories/orgapacheaccumulo-1001/org/apache/accumulo/accumulo/1.5.1/accumulo-1.5.1-src.tar.gz
>>>>>>>>>>>
>>>>>>>>>>> Binary tarball:
>>>>>>>>>>> http://repository.apache.org/content/repositories/orgapacheaccumulo-1001/org/apache/accumulo/accumulo/1.5.1/accumulo-1.5.1-bin.tar.gz
>>>>>>>>>>>
>>>>>>>>>>> Changes since 1.5.1-RC1: ACCUMULO-1908, ACCUMULO-1935, ACCUMULO-2299,
>>>>>>>>>>> ACCUMULO-2329, ACCUMULO-2331, ACCUMULO-2332, ACCUMULO-2334,
>>>>>>>>>>> ACCUMULO-2337, ACCUMULO-2342, ACCUMULO-2344, ACCUMULO-2356,
>>>>>>>>>>> ACCUMULO-2360
>>>>>>>>>>>
>>>>>>>>>>> Changes since 1.5.0:
>>>>>>>>>>> https://git-wip-us.apache.org/repos/asf?p=accumulo.git;a=commitdiff;h=d277321d176b71753d391f896f09dc9785173cb0
>>>>>>>>>>>
>>>>>>>>>>> Keys: http://www.apache.org/dist/accumulo/KEYS
>>>>>>>>>>>
>>>>>>>>>>> Testing:
>>>>>>>>>>>
>>>>>>>>>>> Manual testing and verification of fixes since RC1 and 12hr CI with
>>>>>>>>>>> verification performed. All previously mentioned testing done for
>>>>>>>>>>> RC1.
>>>>>>>>>>>
>>>>>>>>>>> This vote will be open for the next 72 hours.
>>>>>>>>>>>
>>>>>>>>>>> Upon successful completion of this vote, a 1.5.1 gpg-signed Git tag will
>>>>>>>>>>> be created from c810f51b and the above staging repository will be
>>>>>>>>>>> promoted.
>>>>>>>>>>>
>>>>>>>>>>> - Josh
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>
>>>
>>
>
