hbase-user mailing list archives

From Heng Chen <heng.chen.1...@gmail.com>
Subject Re: Use experience and performance data of offheap from Alibaba online cluster
Date Sun, 20 Nov 2016 01:05:35 GMT
The performance looks great!

2016-11-19 18:03 GMT+08:00 Ted Yu <yuzhihong@gmail.com>:
> Opening a JIRA would be fine.
> This makes it easier for people to obtain the patch(es).
>
> Cheers
>
>> On Nov 18, 2016, at 11:35 PM, Anoop John <anoop.hbase@gmail.com> wrote:
>>
>> Because of some compatibility issues, we decided that this will be done
>> in 2.0 only.  Ya, as Andy said, it would be great to share the 1.x
>> backported patches.  Is it a mega patch at your end, or issue by issue
>> patches?  The latter would be best.  Please share the patches in some
>> place along with a list of issues backported; I can help with verifying
>> the issues to make sure we don't miss any...
>>
>> -Anoop-
>>
>>> On Sat, Nov 19, 2016 at 12:32 AM, Enis Söztutar <enis.soz@gmail.com> wrote:
>>> Thanks for sharing this. Great work.
>>>
>>> I don't see any reason why we cannot backport to branch-1.
>>>
>>> Enis
>>>
>>> On Fri, Nov 18, 2016 at 9:37 AM, Andrew Purtell <andrew.purtell@gmail.com>
>>> wrote:
>>>
>>>> Yes, please, the patches will be useful to the community even if we decide
>>>> not to backport into an official 1.x release.
>>>>
>>>>
>>>>> On Nov 18, 2016, at 12:25 PM, Bryan Beaudreault <bbeaudreault@hubspot.com> wrote:
>>>>>
>>>>> Is the backported patch available anywhere? Not seeing it on the
>>>>> referenced JIRA. If it ends up not getting officially backported to
>>>>> branch-1 due to 2.0 being around the corner, some of us who build our
>>>>> own deploy may want to integrate it into our builds. Thanks! These
>>>>> numbers look great.
>>>>>
>>>>>> On Fri, Nov 18, 2016 at 12:20 PM Anoop John <anoop.hbase@gmail.com> wrote:
>>>>>>
>>>>>> Hi Yu Li
>>>>>>        Good to see that the off heap work helped you.  The perf
>>>>>> numbers look great.  So this is a comparison of the on heap L1 cache
>>>>>> vs the off heap L2 cache (HBASE-11425 enabled).  So for 2.0 we should
>>>>>> make the L2 off heap cache ON by default, I believe.  Will raise a
>>>>>> jira for that and we can discuss under that.  L2 off heap cache for
>>>>>> data blocks and L1 cache for index blocks seems the right choice.
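For context, the L1/L2 split described above is what HBase's CombinedBlockCache does once the off heap BucketCache is enabled: data blocks land in the off heap L2 while index and bloom blocks stay in the on heap L1. A minimal hbase-site.xml sketch; the 4096 MB size is an illustrative value, not taken from this thread, and should be tuned to the actual memory budget:

```xml
<!-- hbase-site.xml: enable the off heap L2 BucketCache (HBASE-11425 read path).
     With CombinedBlockCache, data blocks go to the off heap L2 while index
     and bloom blocks remain in the on heap L1 LruBlockCache. -->
<property>
  <name>hbase.bucketcache.ioengine</name>
  <value>offheap</value>
</property>
<property>
  <!-- L2 cache size in MB (example value only; tune per machine) -->
  <name>hbase.bucketcache.size</name>
  <value>4096</value>
</property>
```

The JVM direct-memory ceiling must also cover the bucket cache, e.g. `export HBASE_OFFHEAPSIZE=5G` in hbase-env.sh (the 5G figure is likewise an assumption for illustration).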
>>>>>>
>>>>>> Thanks for the backport and the help in testing the feature.  You
>>>>>> were able to find some corner case bugs and helped the community to
>>>>>> fix them.  Thanks goes to your whole team.
>>>>>>
>>>>>> -Anoop-
>>>>>>
>>>>>>
>>>>>>> On Fri, Nov 18, 2016 at 10:14 PM, Yu Li <carp84@gmail.com> wrote:
>>>>>>>
>>>>>>> Sorry guys, let me retry the inline images:
>>>>>>>
>>>>>>> Performance w/o offheap:
>>>>>>>
>>>>>>> Performance w/ offheap:
>>>>>>>
>>>>>>> Peak Get QPS of one single RS during Singles' Day (11/11):
>>>>>>>
>>>>>>> And attaching the files in case inline is still not working:
>>>>>>>
>>>>>>> Performance_without_offheap.png
>>>>>>> <https://drive.google.com/file/d/0B017Q40_F5uwbWEzUGktYVIya3JkcXVjRkFvVGNtM0VxWC1n/view?usp=drive_web>
>>>>>>>
>>>>>>> Performance_with_offheap.png
>>>>>>> <https://drive.google.com/file/d/0B017Q40_F5uweGR2cnJEU0M1MWwtRFJ5YkxUeFVrcUdPc2ww/view?usp=drive_web>
>>>>>>>
>>>>>>> Peak_Get_QPS_of_Single_RS.png
>>>>>>> <https://drive.google.com/file/d/0B017Q40_F5uwQ2FkR2k0ZmEtRVNGSFp5RUxHM3F6bHpNYnJz/view?usp=drive_web>
>>>>>>> Best Regards,
>>>>>>> Yu
>>>>>>>
>>>>>>>> On 18 November 2016 at 19:29, Ted Yu <yuzhihong@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Yu:
>>>>>>>> With positive results, more hbase users would be asking for the
>>>>>>>> backport of offheap read path patches.
>>>>>>>>
>>>>>>>> Do you think you or your coworker has the bandwidth to publish a
>>>>>>>> backport for branch-1?
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>>
>>>>>>>>> On Nov 18, 2016, at 12:11 AM, Yu Li <carp84@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> Dear all,
>>>>>>>>>
>>>>>>>>> We have backported read path offheap (HBASE-11425) to our
>>>>>>>>> customized hbase-1.1.2 (thanks @Anoop for the help/support) and run
>>>>>>>>> it online for more than a month, and would like to share our
>>>>>>>>> experience, for what it's worth (smile).
>>>>>>>>>
>>>>>>>>> Generally speaking, we gained a better and more stable
>>>>>>>>> throughput/performance with offheap, and below are some details:
>>>>>>>>>
>>>>>>>>> 1. QPS becomes more stable with offheap
>>>>>>>>>
>>>>>>>>> Performance w/o offheap:
>>>>>>>>>
>>>>>>>>> Performance w/ offheap:
>>>>>>>>>
>>>>>>>>> These data come from our online A/B test cluster (450 physical
>>>>>>>>> machines, each with 256G memory + 64 cores) running real world
>>>>>>>>> workloads; they show that with offheap we could gain a more stable
>>>>>>>>> throughput as well as better performance.
>>>>>>>>>
>>>>>>>>> We are not showing fully online data here because online we
>>>>>>>>> published the version with both offheap and NettyRpcServer
>>>>>>>>> together, so there is no standalone comparison data for offheap.
>>>>>>>>> 2. Full GC frequency and cost
>>>>>>>>>
>>>>>>>>> Average Full GC STW time reduced from 11s to 7s with offheap.
>>>>>>>>>
>>>>>>>>> 3. Young GC frequency and cost
>>>>>>>>>
>>>>>>>>> No performance degradation observed with offheap.
>>>>>>>>>
>>>>>>>>> 4. Peak throughput of one single RS
>>>>>>>>>
>>>>>>>>> On Singles' Day (11/11), peak throughput of one single RS reached
>>>>>>>>> 100K QPS, among which 90K came from Get. Combined with the internet
>>>>>>>>> in/out data, we can infer that the average result size of a get
>>>>>>>>> request is ~1KB.
>>>>>>>>>
>>>>>>>>> Offheap is used on all online machines (more than 1600 nodes)
>>>>>>>>> instead of LruCache, so the above QPS is gained from the offheap
>>>>>>>>> bucketcache, along with NettyRpcServer (HBASE-15756).
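For readers who want to try the same combination, the NettyRpcServer from HBASE-15756 is selected through a server-side property. A hedged sketch; the class name below is the one used in Apache HBase 2.0, and a 1.x backport may package it differently:

```xml
<!-- hbase-site.xml: swap the RPC server implementation for the Netty-based
     one (HBASE-15756). Verify the class name against your build. -->
<property>
  <name>hbase.rpc.server.impl</name>
  <value>org.apache.hadoop.hbase.ipc.NettyRpcServer</value>
</property>
```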
>>>>>>>>>
>>>>>>>>> Just let us know if you have any comments. Thanks.
>>>>>>>>>
>>>>>>>>> Best Regards,
>>>>>>>>> Yu
