hbase-dev mailing list archives

From Anoop John <anoop.hb...@gmail.com>
Subject Re: Use experience and performance data of offheap from Alibaba online cluster
Date Sat, 19 Nov 2016 07:35:34 GMT
Because of some compatibility issues, we decided that this will be done
in 2.0 only.  Yes, as Andy said, it would be great to share the 1.x
backported patches.  Is it a mega patch at your end, or issue-by-issue
patches?  The latter would be best.  Please share the patches somewhere
along with a list of the issues backported. I can help with verifying
the issues so as to make sure we don't miss any.

-Anoop-
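For readers following along, the off-heap (L2) bucket cache discussed in this thread is typically enabled with hbase-site.xml settings along these lines; the 8 GB size here is purely illustrative, not a value taken from the Alibaba deployment:

```xml
<!-- hbase-site.xml: enable the off-heap (L2) bucket cache.
     The cache size below (in MB) is an illustrative example only. -->
<property>
  <name>hbase.bucketcache.ioengine</name>
  <value>offheap</value>
</property>
<property>
  <name>hbase.bucketcache.size</name>
  <value>8192</value>
</property>
```

The off-heap allocation itself is sized separately (e.g. via HBASE_OFFHEAPSIZE in hbase-env.sh) and must be at least as large as the bucket cache; consult the HBase Reference Guide for the exact knobs on your version.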

On Sat, Nov 19, 2016 at 12:32 AM, Enis Söztutar <enis.soz@gmail.com> wrote:
> Thanks for sharing this. Great work.
>
> I don't see any reason why we cannot backport to branch-1.
>
> Enis
>
> On Fri, Nov 18, 2016 at 9:37 AM, Andrew Purtell <andrew.purtell@gmail.com>
> wrote:
>
>> Yes, please, the patches will be useful to the community even if we decide
>> not to backport into an official 1.x release.
>>
>>
>> > On Nov 18, 2016, at 12:25 PM, Bryan Beaudreault <
>> bbeaudreault@hubspot.com> wrote:
>> >
>> > Is the backported patch available anywhere? Not seeing it on the
>> referenced
>> > JIRA. If it ends up not getting officially backported to branch-1 due to
>> > 2.0 around the corner, some of us who build our own deploy may want to
>> > integrate into our builds. Thanks! These numbers look great
>> >
>> >> On Fri, Nov 18, 2016 at 12:20 PM Anoop John <anoop.hbase@gmail.com>
>> wrote:
>> >>
>> >> Hi Yu Li
>> >>               Good to see that the off-heap work helped you.  The perf
>> >> numbers look great.  So this is a comparison of the on-heap L1 cache vs
>> >> the off-heap L2 cache (HBASE-11425 enabled).   So for 2.0 we should make
>> >> the L2 off-heap cache ON by default, I believe.  I will raise a JIRA for
>> >> that so we can discuss under it.   An L2 off-heap cache for data blocks
>> >> with the L1 cache for index blocks seems the right choice.
>> >>
>> >> Thanks for the backport and the help in testing the feature.  You were
>> >> able to find some corner-case bugs and helped the community fix them.
>> >> Thanks go to your whole team.
>> >>
>> >> -Anoop-
>> >>
>> >>
>> >>> On Fri, Nov 18, 2016 at 10:14 PM, Yu Li <carp84@gmail.com> wrote:
>> >>>
>> >>> Sorry guys, let me retry the inline images:
>> >>>
>> >>> Performance w/o offheap:
>> >>>
>> >>>
>> >>> Performance w/ offheap:
>> >>>
>> >>>
>> >>> Peak Get QPS of one single RS during Singles' Day (11/11):
>> >>>
>> >>>
>> >>>
>> >>> And attach the files in case inline still not working:
>> >>>
>> >>> Performance_without_offheap.png
>> >>> <https://drive.google.com/file/d/0B017Q40_F5uwbWEzUGktYVIya3JkcXVjRkFvVGNtM0VxWC1n/view?usp=drive_web>
>> >>>
>> >>> Performance_with_offheap.png
>> >>> <https://drive.google.com/file/d/0B017Q40_F5uweGR2cnJEU0M1MWwtRFJ5YkxUeFVrcUdPc2ww/view?usp=drive_web>
>> >>>
>> >>> Peak_Get_QPS_of_Single_RS.png
>> >>> <https://drive.google.com/file/d/0B017Q40_F5uwQ2FkR2k0ZmEtRVNGSFp5RUxHM3F6bHpNYnJz/view?usp=drive_web>
>> >>>
>> >>>
>> >>>
>> >>> Best Regards,
>> >>> Yu
>> >>>
>> >>>> On 18 November 2016 at 19:29, Ted Yu <yuzhihong@gmail.com> wrote:
>> >>>>
>> >>>> Yu:
>> >>>> With positive results, more hbase users would be asking for the
>> backport
>> >>>> of offheap read path patches.
>> >>>>
>> >>>> Do you think you or your coworker has the bandwidth to publish
>> backport
>> >>>> for branch-1 ?
>> >>>>
>> >>>> Thanks
>> >>>>
>> >>>>> On Nov 18, 2016, at 12:11 AM, Yu Li <carp84@gmail.com> wrote:
>> >>>>>
>> >>>>> Dear all,
>> >>>>>
>> >>>>> We have backported the read-path offheap work (HBASE-11425) to our
>> >>>> customized hbase-1.1.2 (thanks @Anoop for the help/support), have run it
>> >>>> online for more than a month, and would like to share our experience,
>> >>>> for what it's worth (smile).
>> >>>>>
>> >>>>> Generally speaking, we gained a better and more stable
>> >>>> throughput/performance with offheap, and below are some details:
>> >>>>> 1. QPS becomes more stable with offheap
>> >>>>>
>> >>>>> Performance w/o offheap:
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> Performance w/ offheap:
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> These data come from our online A/B test cluster (450 physical
>> >>>> machines, each with 256 GB memory and 64 cores) running real-world
>> >>>> workloads. They show that with offheap we gain more stable throughput
>> >>>> as well as better performance.
>> >>>>>
>> >>>>> We are not showing full online data here because the version we
>> >>>> published online includes both offheap and NettyRpcServer together, so
>> >>>> there is no standalone comparison data for offheap.
>> >>>>>
>> >>>>> 2. Full GC frequency and cost
>> >>>>>
>> >>>>> Average Full GC STW time reduced from 11s to 7s with offheap.
>> >>>>>
>> >>>>> 3. Young GC frequency and cost
>> >>>>>
>> >>>>> No performance degradation observed with offheap.
>> >>>>>
>> >>>>> 4. Peak throughput of one single RS
>> >>>>>
>> >>>>> On Singles' Day (11/11), the peak throughput of one single RS reached
>> >>>> 100K QPS, among which 90K were Gets. Combined with network in/out data,
>> >>>> we can infer that the average result size of a Get request is ~1KB.
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> Offheap is used on all online machines (more than 1,600 nodes) instead
>> >>>> of LruCache, so the above QPS was gained from the offheap bucket cache,
>> >>>> along with NettyRpcServer (HBASE-15756).
>> >>>>>
>> >>>>> Just let us know if you have any comments. Thanks.
>> >>>>>
>> >>>>> Best Regards,
>> >>>>> Yu
>> >>>>>
>> >>>>
>> >>>
>> >>>
>> >>
>>
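The ~1KB average Get result size Yu Li infers above can be sanity-checked with simple arithmetic; the implied egress figure below is derived from the numbers in the thread, not reported in the email:

```python
# Back-of-envelope check of the figures quoted upthread.
get_qps = 90_000          # peak Get QPS on one RegionServer (from the thread)
avg_result_bytes = 1024   # ~1 KB average Get result size (from the thread)

# Implied network egress for Get traffic alone at peak, in MB/s.
egress_mb_per_s = get_qps * avg_result_bytes / (1024 * 1024)
print(f"Implied Get egress at peak: ~{egress_mb_per_s:.0f} MB/s")  # ~88 MB/s
```

This is consistent with the claim that average result size can be inferred by dividing observed network out-traffic by Get QPS.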
