hbase-issues mailing list archives

From "Yu Li (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-14463) Severe performance downgrade when parallel reading a single key from BucketCache
Date Wed, 21 Oct 2015 08:32:27 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14966438#comment-14966438 ]

Yu Li commented on HBASE-14463:
-------------------------------

Thanks all for taking a look here.

I was trying to reproduce [~anoop.hbase]/[~ram_krish]'s results and do some investigation, but
ran into some problems, such as a JVM crash during data ingestion with PE (I haven't filed a JIRA
yet since I'm not sure whether it's an env-specific issue) and an AssertionError during multi-get
testing (see HBASE-14660). Now I can get the test to run after disabling assertions and will do
further debugging; I will update my findings later.

[~jingcheng.du@intel.com] I also suspect the purge call slows down performance; I will add a
threshold there and check the perf comparison (a rough sketch of the idea is below). Thanks for pointing it out.
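
A minimal sketch of the threshold idea, assuming a simple counter-based gate; the class name,
the threshold value, and the purge callback below are hypothetical, not from any posted patch:

{code}
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: only run the expensive purge once enough new lock
// entries have accumulated, instead of on every lookup. Names and the
// threshold value are illustrative only.
public class ThrottledPurge {
  private static final int PURGE_THRESHOLD = 1000; // assumed value
  private final AtomicInteger entriesSincePurge = new AtomicInteger();

  /** Call this where the pool currently purges unconditionally. */
  public void maybePurge(Runnable purge) {
    if (entriesSincePurge.incrementAndGet() >= PURGE_THRESHOLD) {
      entriesSincePurge.set(0);
      purge.run(); // the real purge of stale entries would go here
    }
  }
}
{code}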

[~lhofhansl] we need to store the lock (entry) somewhere, and using a lockPool is for reducing
lock contention. I think the idea of using weak references is good, but it hasn't had any perf
testing here yet (a rough sketch of what I have in mind is below). If you have any better idea, please let me know :-)
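
A minimal sketch of the weak-reference pool being discussed, assuming one lock per id keyed in a
ConcurrentHashMap; the class name and structure are illustrative, not the actual pool code in the patch:

{code}
import java.lang.ref.WeakReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Locks are held only weakly by the pool: while a caller holds a strong
// reference the entry stays alive, and once nobody uses an id the lock can be
// garbage collected instead of accumulating forever.
public class WeakLockPool {
  private final ConcurrentHashMap<Long, WeakReference<ReentrantReadWriteLock>> pool =
      new ConcurrentHashMap<Long, WeakReference<ReentrantReadWriteLock>>();

  public ReentrantReadWriteLock getLock(long id) {
    while (true) {
      WeakReference<ReentrantReadWriteLock> ref = pool.get(id);
      ReentrantReadWriteLock lock = (ref == null) ? null : ref.get();
      if (lock != null) {
        return lock;
      }
      ReentrantReadWriteLock fresh = new ReentrantReadWriteLock();
      WeakReference<ReentrantReadWriteLock> freshRef =
          new WeakReference<ReentrantReadWriteLock>(fresh);
      if (ref == null) {
        if (pool.putIfAbsent(id, freshRef) == null) {
          return fresh;
        }
      } else if (pool.replace(id, ref, freshRef)) {
        return fresh; // the old reference was cleared by GC, swap in a new lock
      }
      // Lost a race with another thread; loop and reuse whatever it installed.
    }
  }
}
{code}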

> Severe performance downgrade when parallel reading a single key from BucketCache
> --------------------------------------------------------------------------------
>
>                 Key: HBASE-14463
>                 URL: https://issues.apache.org/jira/browse/HBASE-14463
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.14, 1.1.2
>            Reporter: Yu Li
>            Assignee: Yu Li
>             Fix For: 2.0.0, 1.2.0, 1.3.0, 0.98.16
>
>         Attachments: GC_with_WeakObjectPool.png, HBASE-14463.patch, HBASE-14463_v11.patch,
HBASE-14463_v12.patch, HBASE-14463_v2.patch, HBASE-14463_v3.patch, HBASE-14463_v4.patch, HBASE-14463_v5.patch,
TestBucketCache-new_with_IdLock.png, TestBucketCache-new_with_IdReadWriteLock.png, TestBucketCache_with_IdLock-latest.png,
TestBucketCache_with_IdLock.png, TestBucketCache_with_IdReadWriteLock-latest.png, TestBucketCache_with_IdReadWriteLock-resolveLockLeak.png,
TestBucketCache_with_IdReadWriteLock.png
>
>
> We store feature data of online items in HBase, do machine learning on these features, and supply the outputs to our online search engine. In such a scenario we launch hundreds of YARN workers, and each worker reads all the features of one item (i.e. a single rowkey in HBase), so there is heavy parallel reading on a single rowkey.
> We were using LruBlockCache but recently started trying BucketCache to resolve GC issues, and just as the title says, we have observed a severe performance downgrade. After some analysis we found that the root cause is the lock in BucketCache#getBlock, as shown below:
> {code}
>       try {
>         lockEntry = offsetLock.getLockEntry(bucketEntry.offset());
>         // ...
>         if (bucketEntry.equals(backingMap.get(key))) {
>           // ...
>           int len = bucketEntry.getLength();
>           Cacheable cachedBlock = ioEngine.read(bucketEntry.offset(), len,
>               bucketEntry.deserializerReference(this.deserialiserMap));
> {code}
> Since ioEngine.read involves an array copy, it is much more costly than the corresponding operation in LruBlockCache. And since IdLock#getLockEntry uses synchronized, parallel reads landing on the same bucket are executed serially, which causes really bad performance.
> To resolve the problem, we propose to use ReentrantReadWriteLock in BucketCache, and introduce a new class called IdReadWriteLock to implement it.
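
A minimal sketch of the IdReadWriteLock idea from the description above, assuming one
ReentrantReadWriteLock per block offset held in a ConcurrentHashMap; the real class in the
attached patches may differ in naming and details:

{code}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// One read/write lock per id (e.g. the bucket entry offset). Concurrent readers
// of the same block share the read lock and proceed in parallel; only exclusive
// operations such as eviction need the write lock.
public class IdReadWriteLockSketch {
  private final ConcurrentHashMap<Long, ReentrantReadWriteLock> map =
      new ConcurrentHashMap<Long, ReentrantReadWriteLock>();

  public ReentrantReadWriteLock getLock(long id) {
    ReentrantReadWriteLock existing = map.get(id);
    if (existing != null) {
      return existing;
    }
    ReentrantReadWriteLock fresh = new ReentrantReadWriteLock();
    existing = map.putIfAbsent(id, fresh);
    return existing == null ? fresh : existing;
  }
}
{code}

On the read path of BucketCache#getBlock this would replace the exclusive IdLock entry with
lock.readLock().lock() around ioEngine.read, so parallel gets of the same key would no longer
serialize on a single monitor.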



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
