hbase-issues mailing list archives

From "Yu Li (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-17747) Support both weak and soft object pool
Date Tue, 14 Mar 2017 06:07:41 GMT

    [ https://issues.apache.org/jira/browse/HBASE-17747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15923639#comment-15923639 ]

Yu Li commented on HBASE-17747:

bq. you can run faster than the queue can be cleared and so you can OOME. Might want to make
it configurable then but default it on.
To be frank, if we cannot hit an OOME in embedded mode, I don't think we can hit one in distributed mode.
But yes, making it configurable is more flexible; let me open another JIRA to do this. Thanks.

bq. There is no 'shrink' operation for ConcurrentHashMap, so if you put 1M objects into the
map and then remove 0.99M, the table size will still be more than 1M.
So what's the harm, boss? If memory runs short, the soft references get cleared and the
map entries with them; if memory is sufficient, there seems to be no harm in leaving the map as
it is. In theory, I think the javadoc description is strong enough, and in practice
we have already tested against both embedded and distributed mode, right?

bq. Give it a try. We need to confirm that G1 can work well.
Sorry, but I'm not that familiar with G1 tuning, so I'm not sure what kind of testing
would confirm that G1 works well. And I don't think this is related to the GC algorithm; I mean,
which part might have an issue under G1 but not under CMS?

Correct me if I'm wrong, but IMHO if there's no problem in theory, we could let the commit
in and fix any issue that emerges later; that seems to be the way we've been working. So
I propose to follow stack's suggestion: make the reference type configurable for {{IdReadWriteLock}}
and use soft references by default. Does that sound good to you, [~Apache9]? If we reach consensus,
I will open a new JIRA and close this one. Thanks.
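
To make the proposal concrete, here is a minimal sketch of an object pool whose reference type can be switched between weak and soft. It is only an illustration under stated assumptions, not the attached patch and not the actual {{WeakObjectPool}} code: the class name {{RefObjectPool}}, the {{ReferenceType}} enum, and the {{purge}} and {{size}} methods are all hypothetical. It uses only standard {{java.lang.ref}} and {{java.util.concurrent}} classes.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Function;

/** Hypothetical sketch of a pool that can hold its values weakly or softly. */
class RefObjectPool<K, V> {
    enum ReferenceType { WEAK, SOFT }

    /** A reference that remembers the key it was stored under. */
    private interface Keyed<K> { K key(); }

    private static final class KeyedSoft<K, V> extends SoftReference<V> implements Keyed<K> {
        private final K key;
        KeyedSoft(K key, V value, ReferenceQueue<V> queue) { super(value, queue); this.key = key; }
        public K key() { return key; }
    }

    private static final class KeyedWeak<K, V> extends WeakReference<V> implements Keyed<K> {
        private final K key;
        KeyedWeak(K key, V value, ReferenceQueue<V> queue) { super(value, queue); this.key = key; }
        public K key() { return key; }
    }

    private final ReferenceQueue<V> staleRefs = new ReferenceQueue<>();
    private final ConcurrentMap<K, Reference<V>> cache = new ConcurrentHashMap<>();
    private final Function<K, V> factory;
    private final ReferenceType type;

    RefObjectPool(Function<K, V> factory, ReferenceType type) {
        this.factory = factory;
        this.type = type;
    }

    private Reference<V> newRef(K key, V value) {
        if (type == ReferenceType.SOFT) {
            return new KeyedSoft<>(key, value, staleRefs);
        }
        return new KeyedWeak<>(key, value, staleRefs);
    }

    /** Removes map entries whose referent has already been collected. */
    @SuppressWarnings("unchecked")
    void purge() {
        Reference<? extends V> ref;
        while ((ref = staleRefs.poll()) != null) {
            // Two-argument remove: only drop the entry if it still maps to this stale reference.
            cache.remove(((Keyed<K>) ref).key(), ref);
        }
    }

    /** Returns the pooled object for the key, creating it if absent or already collected. */
    V get(K key) {
        purge();
        while (true) {
            Reference<V> ref = cache.get(key);
            if (ref != null) {
                V existing = ref.get();
                if (existing != null) {
                    return existing;
                }
                cache.remove(key, ref); // cleared but not yet purged
            }
            V created = factory.apply(key);
            Reference<V> raced = cache.putIfAbsent(key, newRef(key, created));
            if (raced == null) {
                return created; // we won the race
            }
            V other = raced.get();
            if (other != null) {
                return other; // another thread installed a live object first
            }
            cache.remove(key, raced); // the competing entry was already cleared; retry
        }
    }

    int size() {
        purge();
        return cache.size();
    }
}
```

With {{ReferenceType.SOFT}} the pooled locks survive ordinary young collections and are only reclaimed under memory pressure, so repeated {{get}} calls for a hot key keep returning the same instance; with {{ReferenceType.WEAK}} a lock can be reclaimed as soon as no caller holds it. A caller could build a lock pool with something like {{new RefObjectPool<Long, ReentrantReadWriteLock>(id -> new ReentrantReadWriteLock(), RefObjectPool.ReferenceType.SOFT)}}.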

Btw, allow me to emphasize that even in distributed mode we got a 5%~7% performance
improvement with soft references, with 256 clients querying one RS, which is not a special case.
So there is a benefit in the "real world" even if you regard embedded mode as an informal case.

> Support both weak and soft object pool
> --------------------------------------
>                 Key: HBASE-17747
>                 URL: https://issues.apache.org/jira/browse/HBASE-17747
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 2.0
>            Reporter: Yu Li
>            Assignee: Yu Li
>             Fix For: 2.0
>         Attachments: HBASE-17747.patch, HBASE-17747.v2.patch, HBASE-17747.v3.patch
> During YCSB testing in embedded mode after HBASE-17744, we found that under high read
load GC is quite severe even with an offheap L2 cache. After some investigation, we found it is
caused by the use of weak references in {{IdReadWriteLock}}. In embedded mode reads are so quick
that the lock may already have been promoted to the old generation by the time its weak reference is
cleared, which dirties the card table (the old reference is removed and a new lock object is set
into {{referenceCache}}, see {{WeakObjectPool#get}}) and thus slows down YGC. In distributed mode,
weak references also cause more lock objects to be created than soft references would, which slows
down processing.
> So we propose to use soft references for the {{IdReadWriteLock}} used in the cache; these
are not cleared until the JVM runs short of memory, which could resolve the issue described above.
What's more, we propose to extend {{WeakObjectPool}} to be more general, so that it supports both
weak and soft references.
> Note that the GC issue only emerges in embedded mode with DirectOperator, in which
case all costs on the wire are removed, producing extremely high concurrency.

This message was sent by Atlassian JIRA