Date: Tue, 14 Mar 2017 06:07:41 +0000 (UTC)
From: "Yu Li (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-17747) Support both weak and soft object pool

[ https://issues.apache.org/jira/browse/HBASE-17747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15923639#comment-15923639 ]

Yu Li commented on HBASE-17747:
-------------------------------

bq. you can run faster than the queue can be cleared and so you can OOME. Might want to make it configurable then but default it on.

If we cannot OOME in embedded mode, I don't think we can OOME in distributed mode, to be frank. But yes, making it configurable is more flexible; let me open another JIRA to do this. Thanks.

bq. There is no 'shrink' operation for ConcurrentHashMap, so if you put 1M objects into the map and then remove 0.99M, the table size will still be more than 1M.

So what's the harm, boss? If memory runs short, the soft references will get cleared and thus the map will be cleared; if memory is sufficient, there seems to be no harm in letting it be. If we discuss this in theory, I think the javadoc description is strong enough, and if we discuss it in practice, we have already tested against both embedded and distributed mode, right?

bq. Give it a try. We need to confirm that G1 can work well.

Sorry, but I'm not that familiar with G1 tuning, so I'm not sure what kind of testing would confirm that G1 works well. And I don't think this is related to the GC algorithm; I mean, what part might have an issue under G1 GC but not under CMS GC? Correct me if I'm wrong, but IMHO if there's no problem in theory, we could let the commit in and fix any issue that emerges later; that seems to be the way we've been following.

So I propose to follow stack's suggestion: make it configurable for {{IdReadWriteLock}} and use soft reference by default. Sounds good to you, [~Apache9]? If we reach a consensus, I will open a new JIRA and close this one. Thanks.

btw, allow me to emphasize the fact that even in distributed mode we got a 5%~7% performance improvement with soft reference, with 256 clients querying one RS, which is not a special case. So there's benefit in the "real world" even if you take embedded mode as an informal case.

> Support both weak and soft object pool
> --------------------------------------
>
>                 Key: HBASE-17747
>                 URL: https://issues.apache.org/jira/browse/HBASE-17747
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 2.0
>            Reporter: Yu Li
>            Assignee: Yu Li
>             Fix For: 2.0
>
>         Attachments: HBASE-17747.patch, HBASE-17747.v2.patch, HBASE-17747.v3.patch
>
>
> During YCSB testing on embedded mode after HBASE-17744, we found that under high read load GC is quite severe even with offheap L2 cache. After some investigation, we found it's caused by using weak reference in {{IdReadWriteLock}}. In embedded mode the read is so quick that the lock might already have been promoted to the old generation when the weak reference is cleared, which dirties the card table (the old reference gets removed and a new lock object is set into {{referenceCache}}, see {{WeakObjectPool#get}}) and thus slows YGC. In distributed mode, more lock objects are also created with weak reference than with soft reference, which slows down the processing.
> So we propose to use soft reference for the {{IdReadWriteLock}} used in cache, which won't get cleared until the JVM runs short of memory, and which could resolve the issue mentioned above. What's more, we propose to extend the {{WeakObjectPool}} to be more general so that it supports both weak and soft reference.
> Note that the GC issue only emerges under embedded mode with DirectOperator, in which case all costs on the wire are removed, thus producing extremely high concurrency.
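
To make the proposal in the description concrete, here is a minimal sketch of a pool whose reference strength is chosen at construction time. It is only an illustration under stated assumptions: the class name {{RefObjectPool}}, the {{RefType}} enum, and the {{Keyed}} helper are hypothetical and are not taken from the attached patches or from the existing {{WeakObjectPool}}. Stale entries are removed through a reference queue on each {{get}} call.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Function;

/** Hypothetical pool for illustration only; not the HBase WeakObjectPool. */
final class RefObjectPool<K, V> {
  enum RefType { WEAK, SOFT }

  /** Lets a cleared reference report which map entry it belonged to. */
  private interface Keyed<K> { K key(); }

  private static final class WeakEntry<K, V> extends WeakReference<V> implements Keyed<K> {
    private final K key;
    WeakEntry(K key, V value, ReferenceQueue<V> queue) { super(value, queue); this.key = key; }
    @Override public K key() { return key; }
  }

  private static final class SoftEntry<K, V> extends SoftReference<V> implements Keyed<K> {
    private final K key;
    SoftEntry(K key, V value, ReferenceQueue<V> queue) { super(value, queue); this.key = key; }
    @Override public K key() { return key; }
  }

  private final ReferenceQueue<V> staleRefs = new ReferenceQueue<>();
  private final ConcurrentMap<K, Reference<V>> cache = new ConcurrentHashMap<>();
  private final Function<K, V> factory;
  private final RefType refType;

  RefObjectPool(Function<K, V> factory, RefType refType) {
    this.factory = factory;
    this.refType = refType;
  }

  /** Returns the pooled object for key, creating one if absent or already collected. */
  V get(K key) {
    purge();
    while (true) {
      Reference<V> ref = cache.get(key);
      if (ref != null) {
        V existing = ref.get();
        if (existing != null) {
          return existing;
        }
        cache.remove(key, ref); // referent was collected; drop the stale entry
      }
      V created = factory.apply(key);
      Reference<V> fresh;
      if (refType == RefType.SOFT) {
        fresh = new SoftEntry<K, V>(key, created, staleRefs);
      } else {
        fresh = new WeakEntry<K, V>(key, created, staleRefs);
      }
      if (cache.putIfAbsent(key, fresh) == null) {
        return created;
      }
      // Another thread installed an entry first; retry and use theirs.
    }
  }

  /** Number of entries currently in the map, after removing cleared ones. */
  int size() {
    purge();
    return cache.size();
  }

  /** Removes map entries whose referents the GC has already cleared. */
  @SuppressWarnings("unchecked")
  private void purge() {
    Reference<? extends V> stale;
    while ((stale = staleRefs.poll()) != null) {
      cache.remove(((Keyed<K>) stale).key(), stale);
    }
  }

  public static void main(String[] args) {
    RefObjectPool<Long, ReentrantReadWriteLock> pool =
        new RefObjectPool<>(id -> new ReentrantReadWriteLock(), RefType.SOFT);
    ReentrantReadWriteLock lock = pool.get(42L);
    // The caller still holds a strong reference, so the same instance comes back.
    System.out.println(lock == pool.get(42L));
  }
}
```

With {{RefType.WEAK}} an entry may be cleared as soon as no caller holds the object strongly, while with {{RefType.SOFT}} the JVM normally keeps it until memory gets tight, which is the difference the description relies on. Whether soft references behave well under a particular collector is not something this sketch can show.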