accumulo-dev mailing list archives

From Sean Busbey <bus...@cloudera.com>
Subject Re: 1.6 to 1.7 performance regression
Date Wed, 07 Jun 2017 05:51:38 GMT
I read through all the internal notes I could find from back in that
testing, and I don't see any mention of changing the durability
settings on meta or root.

So that's a plausible source for the perf hit. I don't know when I'll
have time to run through some tests to verify.

On Tue, Jun 6, 2017 at 2:43 PM, Josh Elser <josh.elser@gmail.com> wrote:
> (spinning off from the other thread)
>
> The backstory on Sean's testing can be found in [1]. Essentially, in his
> testing, he observed some cases where there was an unexplained ~30%
> performance impact.
>
> <quote>
> Batch write performance for Accumulo 1.7.2-cdh5.5.0 shows a regression of up
> to approximately 30 percent, depending on table shape, when compared to
> Accumulo 1.6.0-cdh5.1.4. The performance decrease is more severe for
> exceptionally large cells (100k and larger) or exceptionally wide rows (10k
> columns). Carefully consider the performance impact for your environment
> when deciding to upgrade to Accumulo 1.7.2-cdh5.5.0.
> </quote>
>
> Since it came up again, I was hoping we could put this concern to rest,
> chalking it up to the WAL flush/sync calls that changed between 1.6 and 1.7
> as documented by our Keith[2]. Hopefully, Sean's notes are sufficient for us
> to reconstruct his environment :)
>
> - Josh
>
> [1]
> https://www.cloudera.com/documentation/other/accumulo/latest/PDF/Apache-Accumulo-Installation-Guide-1-7-2.pdf
> [2] https://accumulo.apache.org/blog/2016/11/02/durability-performance.html
>
>
> -------- Forwarded Message --------
> Subject: Re: [DISCUSS] Question about 1.7 bugfix releases
> Date: Tue, 6 Jun 2017 14:20:27 -0400
> From: Josh Elser <josh.elser@gmail.com>
> To: dev@accumulo.apache.org
>
> On 6/6/17 2:13 PM, Sean Busbey wrote:
>>
>> On Tue, Jun 6, 2017 at 12:07 PM, Josh Elser <josh.elser@gmail.com> wrote:
>>>
>>> On 6/6/17 12:39 PM, Sean Busbey wrote:
>>>>
>>>>
>>>> For example, has anyone done perf comparisons between 1.7 and 1.8.z?
>>>>
>>>> When it came time for me to start telling folks that it was "safe" to
>>>> upgrade to 1.7.z I ran into something like a 40-60% perf degradation
>>>> on writes compared to 1.6 across the board. A little bit of this was
>>>> already fixed in 1.8 at the time, but a substantial amount required a
>>>> non-trivial refactoring because just no one had looked[1]. Even after
>>>> all of that, I still had to caveat things because I still saw a
>>>> ~15-30% perf drop on random writes in the presence of lots of columns.
>>>
>>>
>>>
>>> At the risk of de-railing otherwise good discussion on releases: do you
>>> recall if you had accounted for the following, Sean? (notably, the last
>>> code snippet)
>>>
>>> https://accumulo.apache.org/blog/2016/11/02/durability-performance.html
>>
>>
>> I know that "set durability to flush and not sync" was one of the
>> parameters for the comparison, but I don't remember what was done
>> specifically during the testing back in September, tbh.
>>
>> I can probably dig it out if you'd like; I think we were pretty good
>> at keeping notes. Probably something for a different thread?
>>
>
> Agreed. Just wanted to ask before I forgot again. Saw some relevance in the
> worry of perf regressions 1.7->1.8 based on the existence of those you saw
> 1.6->1.7, but def don't want to derail further here.
>
> If you have the time and the notes, would be happy to review.



-- 
busbey
