hbase-user mailing list archives

From Enis Söztutar <enis....@gmail.com>
Subject Re: Use experience and performance data of offheap from Alibaba online cluster
Date Fri, 18 Nov 2016 19:02:48 GMT
Thanks for sharing this. Great work.

I don't see any reason why we cannot backport to branch-1.

Enis

On Fri, Nov 18, 2016 at 9:37 AM, Andrew Purtell <andrew.purtell@gmail.com>
wrote:

> Yes, please, the patches will be useful to the community even if we decide
> not to backport into an official 1.x release.
>
>
> > On Nov 18, 2016, at 12:25 PM, Bryan Beaudreault
> > <bbeaudreault@hubspot.com> wrote:
> >
> > Is the backported patch available anywhere? I'm not seeing it on the
> > referenced JIRA. If it ends up not getting officially backported to
> > branch-1, with 2.0 around the corner, some of us who build our own
> > deploys may want to integrate it into our builds. Thanks! These numbers
> > look great.
> >
> >> On Fri, Nov 18, 2016 at 12:20 PM Anoop John <anoop.hbase@gmail.com>
> >> wrote:
> >>
> >> Hi Yu Li
> >>               Good to see that the off heap work help you..  The perf
> >> numbers looks great.  So this is a compare of on heap L1 cache vs off
> heap
> >> L2 cache(HBASE-11425 enabled).   So for 2.0 we should make L2 off heap
> >> cache ON by default I believe.  Will raise a jira for that we can
> discuss
> >> under that.   Seems like L2 off heap cache for data blocks and L1 cache
> for
> >> index blocks seems a right choice.
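> >>
> >> For anyone wanting to try this, a minimal sketch of the hbase-site.xml
> >> settings involved (values are illustrative only, not tuned
> >> recommendations; with the offheap ioengine, the combined cache keeps
> >> index/bloom blocks in the on-heap L1 and data blocks in the off-heap
> >> L2):
> >>
> >>   <property>
> >>     <name>hbase.bucketcache.ioengine</name>
> >>     <value>offheap</value>
> >>   </property>
> >>   <!-- Capacity of the off-heap BucketCache, in MB (illustrative). -->
> >>   <property>
> >>     <name>hbase.bucketcache.size</name>
> >>     <value>16384</value>
> >>   </property>
> >>   <!-- Fraction of heap for the L1 LruBlockCache (illustrative). -->
> >>   <property>
> >>     <name>hfile.block.cache.size</name>
> >>     <value>0.2</value>
> >>   </property>
> >>
> >> You would also need HBASE_OFFHEAPSIZE in hbase-env.sh set large enough
> >> to cover the bucket cache's direct memory.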
> >>
> >> Thanks for the backport and for the help in testing the feature. You
> >> were able to find some corner-case bugs and helped the community fix
> >> them. Thanks to your whole team.
> >>
> >> -Anoop-
> >>
> >>
> >>> On Fri, Nov 18, 2016 at 10:14 PM, Yu Li <carp84@gmail.com> wrote:
> >>>
> >>> Sorry guys, let me retry the inline images:
> >>>
> >>> Performance w/o offheap: [inline image]
> >>>
> >>> Performance w/ offheap: [inline image]
> >>>
> >>> Peak Get QPS of one single RS during Singles' Day (11/11): [inline image]
> >>>
> >>> And attaching the files in case inline still does not work:
> >>>
> >>> Performance_without_offheap.png
> >>> <https://drive.google.com/file/d/0B017Q40_F5uwbWEzUGktYVIya3JkcXVjRkFvVGNtM0VxWC1n/view?usp=drive_web>
> >>>
> >>> Performance_with_offheap.png
> >>> <https://drive.google.com/file/d/0B017Q40_F5uweGR2cnJEU0M1MWwtRFJ5YkxUeFVrcUdPc2ww/view?usp=drive_web>
> >>>
> >>> Peak_Get_QPS_of_Single_RS.png
> >>> <https://drive.google.com/file/d/0B017Q40_F5uwQ2FkR2k0ZmEtRVNGSFp5RUxHM3F6bHpNYnJz/view?usp=drive_web>
> >>>
> >>>
> >>> Best Regards,
> >>> Yu
> >>>
> >>>> On 18 November 2016 at 19:29, Ted Yu <yuzhihong@gmail.com> wrote:
> >>>>
> >>>> Yu:
> >>>> With these positive results, more HBase users will be asking for a
> >>>> backport of the offheap read-path patches.
> >>>>
> >>>> Do you think you or your coworkers have the bandwidth to publish a
> >>>> backport for branch-1?
> >>>>
> >>>> Thanks
> >>>>
> >>>>> On Nov 18, 2016, at 12:11 AM, Yu Li <carp84@gmail.com> wrote:
> >>>>>
> >>>>> Dear all,
> >>>>>
> >>>>> We have backported the read-path offheap work (HBASE-11425) to our
> >>>>> customized hbase-1.1.2 (thanks @Anoop for the help/support) and have
> >>>>> run it online for more than a month, and we would like to share our
> >>>>> experience, for what it's worth (smile).
> >>>>>
> >>>>> Generally speaking, we gained better and more stable
> >>>>> throughput/performance with offheap. Some details below:
> >>>>>
> >>>>> 1. QPS becomes more stable with offheap
> >>>>>
> >>>>> Performance w/o offheap: [inline image]
> >>>>>
> >>>>> Performance w/ offheap: [inline image]
> >>>>>
> >>>>> These data come from our online A/B test cluster (450 physical
> >>>>> machines, each with 256GB memory and 64 cores) running real-world
> >>>>> workloads. They show that with offheap we gain more stable throughput
> >>>>> as well as better performance.
> >>>>>
> >>>>> We are not showing full online data here because online we rolled out
> >>>>> the version with offheap and NettyRpcServer together, so there is no
> >>>>> standalone comparison for offheap alone.
> >>>>>
> >>>>> 2. Full GC frequency and cost
> >>>>>
> >>>>> Average Full GC STW time reduced from 11s to 7s with offheap.
> >>>>>
> >>>>> 3. Young GC frequency and cost
> >>>>>
> >>>>> No performance degradation observed with offheap.
> >>>>>
> >>>>> 4. Peak throughput of one single RS
> >>>>>
> >>>>> On Singles' Day (11/11), peak throughput of one single RS reached
> >>>>> 100K QPS, of which 90K were Gets. Combined with the network in/out
> >>>>> data, we can tell the average result size of a Get request is ~1KB.
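> >>>>>
> >>>>> (For anyone checking the arithmetic behind that estimate: average
> >>>>> result size is simply network-out divided by Get QPS, so ~1KB per Get
> >>>>> at 90K Gets/s implies roughly 90K x 1KB = ~90MB/s outbound on that
> >>>>> RS.)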
> >>>>>
> >>>>>
> >>>>>
> >>>>> Offheap is used on all online machines (more than 1600 nodes) in
> >>>>> place of the LruBlockCache, so the above QPS is gained from the
> >>>>> offheap BucketCache, along with NettyRpcServer (HBASE-15756).
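> >>>>>
> >>>>> For reference, a sketch of selecting the Netty RPC server in
> >>>>> hbase-site.xml (property name as in the HBASE-15756 work on the 2.0
> >>>>> code line; a custom 1.x backport may wire it differently):
> >>>>>
> >>>>>   <property>
> >>>>>     <name>hbase.rpc.server.impl</name>
> >>>>>     <value>org.apache.hadoop.hbase.ipc.NettyRpcServer</value>
> >>>>>   </property>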
> >>>>>
> >>>>> Just let us know if you have any comments. Thanks.
> >>>>>
> >>>>> Best Regards,
> >>>>> Yu
> >>>>
> >>>
> >>>
> >>
>
