Date: Thu, 22 Oct 2015 19:11:27 +0000 (UTC)
From: "Yu Li (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Updated] (HBASE-14463) Severe performance downgrade when parallel reading a single key from BucketCache

     [ https://issues.apache.org/jira/browse/HBASE-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yu Li updated HBASE-14463:
--------------------------
    Attachment: pe_use_same_keys.patch

After adding support for recording/loading keys for randomReads in the PE tool (see the attached patch for the code changes), I rechecked the performance with --multiGet=100 and 25 threads. The results are as follows:

{noformat}
w/o patch:
1. Min: 94220ms  Max: 95193ms  Avg: 94826ms
2. Min: 91405ms  Max: 92271ms  Avg: 91955ms
3. Min: 95314ms  Max: 96266ms  Avg: 95946ms
4. Min: 95545ms  Max: 96534ms  Avg: 96208ms
Average: 94733.75ms

w/ patch:
1. Min: 94887ms  Max: 95890ms  Avg: 95561ms
2. Min: 94681ms  Max: 95643ms  Avg: 95285ms
3. Min: 93880ms  Max: 94856ms  Avg: 94514ms
4. Min: 93418ms  Max: 94283ms  Avg: 93981ms
Average: 94835.25ms
{noformat}

The corresponding BucketCache statistics:

{noformat}
w/o patch:
1. Hits Caching 18,821,913; Misses Caching 11,595
2. Hits Caching 18,821,913; Misses Caching 11,588
3. Hits Caching 18,821,913; Misses Caching 11,586
4. Hits Caching 18,821,913; Misses Caching 11,587

w/ patch:
1. Hits Caching 18,821,913; Misses Caching 11,586
2. Hits Caching 18,821,913; Misses Caching 11,590
3. Hits Caching 18,821,913; Misses Caching 11,587
4. Hits Caching 18,821,913; Misses Caching 11,588
{noformat}

With the same keys replayed in both runs, there is no longer any performance downgrade (the difference is ~0.1%). [~anoop.hbase], [~ram_krish], [~jingcheng.du@intel.com], [~lhofhansl] and [~ikeda], does this latest test result make sense to you? Any other comments? Thanks.
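For reference, below is a minimal Java sketch of the record/load-keys idea used for this test. It is not the attached pe_use_same_keys.patch; the class name, key format and file format here are hypothetical, and only illustrate how the randomly chosen keys of the first run can be persisted and replayed, so that the runs with and without the fix read an identical key set.

{code}
// Hypothetical helper, NOT the attached patch: persists the randomly chosen
// row keys of the first PE run so later runs can replay exactly the same keys
// and therefore hit exactly the same cache blocks.
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class RandomReadKeySet {
  private final List<byte[]> keys = new ArrayList<>();

  /** Generate 'count' random row keys in [0, totalRows) and remember them. */
  public void record(int count, int totalRows, Random rand) {
    for (int i = 0; i < count; i++) {
      keys.add(String.format("%026d", rand.nextInt(totalRows))
          .getBytes(StandardCharsets.UTF_8));
    }
  }

  /** Dump the recorded keys to a plain text file, one key per line. */
  public void save(Path file) throws IOException {
    List<String> lines = new ArrayList<>(keys.size());
    for (byte[] k : keys) {
      lines.add(new String(k, StandardCharsets.UTF_8));
    }
    Files.write(file, lines);
  }

  /** Load previously recorded keys so another run reads the same rows. */
  public void load(Path file) throws IOException {
    keys.clear();
    for (String line : Files.readAllLines(file)) {
      keys.add(line.getBytes(StandardCharsets.UTF_8));
    }
  }

  /** Return the i-th key to read, wrapping around the recorded set. */
  public byte[] get(int i) {
    return keys.get(i % keys.size());
  }
}
{code}

The first measurement would call record() and save(); every later measurement calls load(), so any difference between the runs comes from the cache locking change rather than from a different key distribution.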
> Severe performance downgrade when parallel reading a single key from BucketCache
> --------------------------------------------------------------------------------
>
>                 Key: HBASE-14463
>                 URL: https://issues.apache.org/jira/browse/HBASE-14463
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.14, 1.1.2
>            Reporter: Yu Li
>            Assignee: Yu Li
>             Fix For: 2.0.0, 1.2.0, 1.3.0, 0.98.16
>
>         Attachments: GC_with_WeakObjectPool.png, HBASE-14463.patch, HBASE-14463_v11.patch, HBASE-14463_v12.patch, HBASE-14463_v2.patch, HBASE-14463_v3.patch, HBASE-14463_v4.patch, HBASE-14463_v5.patch, TestBucketCache-new_with_IdLock.png, TestBucketCache-new_with_IdReadWriteLock.png, TestBucketCache_with_IdLock-latest.png, TestBucketCache_with_IdLock.png, TestBucketCache_with_IdReadWriteLock-latest.png, TestBucketCache_with_IdReadWriteLock-resolveLockLeak.png, TestBucketCache_with_IdReadWriteLock.png, pe_use_same_keys.patch, test-results.tar.gz
>
>
> We store feature data of online items in HBase, do machine learning on these features, and supply the outputs to our online search engine. In such a scenario we launch hundreds of YARN workers, and each worker reads all features of one item (i.e. a single rowkey in HBase), so there is heavy parallel reading on a single rowkey.
> We were using LruBlockCache but recently started trying BucketCache to resolve GC issues, and, as titled, we have observed a severe performance downgrade. After some analysis we found the root cause is the lock in BucketCache#getBlock, as shown below:
> {code}
> try {
>   lockEntry = offsetLock.getLockEntry(bucketEntry.offset());
>   // ...
>   if (bucketEntry.equals(backingMap.get(key))) {
>     // ...
>     int len = bucketEntry.getLength();
>     Cacheable cachedBlock = ioEngine.read(bucketEntry.offset(), len,
>         bucketEntry.deserializerReference(this.deserialiserMap));
> {code}
> Since ioEngine.read involves an array copy, it is much more time-consuming than the corresponding operation in LruBlockCache. And since we use synchronized in IdLock#getLockEntry, parallel reads hitting the same bucket are executed serially, which causes really bad performance.
> To resolve the problem, we propose to use ReentrantReadWriteLock in BucketCache, and introduce a new class called IdReadWriteLock to implement it.
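For reference, here is a minimal sketch of the IdReadWriteLock idea described in the quoted proposal. It is not the attached patch (the real class may pool and reclaim locks differently, e.g. via a weak-reference pool as the GC_with_WeakObjectPool.png attachment suggests); it only shows how mapping an id such as a bucket offset to a ReentrantReadWriteLock lets concurrent readers of the same block proceed in parallel while a writer/evictor still gets exclusive access.

{code}
// Minimal sketch only, not the committed IdReadWriteLock: maps an id (e.g. a
// bucket offset) to a ReentrantReadWriteLock so that parallel readers of the
// same block share the read lock instead of serializing on a mutex.
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class IdReadWriteLockSketch {
  private final ConcurrentHashMap<Long, ReentrantReadWriteLock> locks =
      new ConcurrentHashMap<>();

  /** Return the lock for the given id, creating it on first use. */
  public ReentrantReadWriteLock getLock(long id) {
    return locks.computeIfAbsent(id, k -> new ReentrantReadWriteLock());
  }
}

// Usage in a getBlock-like read path (simplified, names from the quoted code):
//
//   ReentrantReadWriteLock lock = offsetLock.getLock(bucketEntry.offset());
//   lock.readLock().lock();
//   try {
//     // ioEngine.read(...): many readers of the same offset run in parallel
//   } finally {
//     lock.readLock().unlock();
//   }
//
// Freeing or evicting the same offset would take lock.writeLock() instead.
{code}

Note that this naive sketch never removes entries from the map; the actual implementation has to reclaim unused locks (which is what the "resolveLockLeak" test run in the attachments refers to).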