hbase-issues mailing list archives

From "Lars Hofhansl (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-5898) Consider double-checked locking for block cache lock
Date Sat, 27 Oct 2012 04:23:12 GMT

    [ https://issues.apache.org/jira/browse/HBASE-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13485348#comment-13485348 ]

Lars Hofhansl commented on HBASE-5898:

Something like this. I don't think the patch works around or even affects the 2nd issue.

The most likely explanation for the 2nd issue seems to be HDFS slowness:
The block is not in the cache, the first thread tries to load it, and while that is happening
all other threads have to (and should) wait.
If there is a temporary network hiccup, that load will take a while... And it would look exactly like
these stack traces, where many threads are queued up behind this lock.

Now that you say it, though... On further consideration I am not sure I buy that there even *is*
a contention issue here. We either:
* have the block in the cache, in which case we'll return it very quickly.
* do not have the block, in which case we have to load it and all other threads must wait.
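
To make the two cases concrete, here is a minimal sketch of double-checked locking around a block cache. It is an illustration only, not the attached patch and not the actual HFileReaderV2 or IdLock code: the class DoubleCheckedBlockCache, its loader callback (standing in for the HDFS read), and the loadCount() helper are all hypothetical names, and the per-offset lock map is a simplification that never releases its entries.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.LongFunction;

/** Hypothetical sketch of double-checked locking; not HBase's real classes. */
final class DoubleCheckedBlockCache {
    // Cached blocks keyed by file offset (assumed key for this sketch).
    private final ConcurrentHashMap<Long, byte[]> cache = new ConcurrentHashMap<>();
    // One lock object per offset; a real implementation would release these.
    private final ConcurrentHashMap<Long, Object> locks = new ConcurrentHashMap<>();
    // Stands in for the slow read from HDFS.
    private final LongFunction<byte[]> loader;
    // Counts how many times the loader actually ran.
    private final AtomicInteger loads = new AtomicInteger();

    DoubleCheckedBlockCache(LongFunction<byte[]> loader) {
        this.loader = loader;
    }

    byte[] readBlock(long offset) {
        // First check, without any lock: a cache hit returns right away.
        byte[] block = cache.get(offset);
        if (block != null) {
            return block;
        }
        // Miss: take the per-offset lock so only one thread loads the block.
        Object lock = locks.computeIfAbsent(offset, k -> new Object());
        synchronized (lock) {
            // Second check: another thread may have loaded it while we waited.
            block = cache.get(offset);
            if (block == null) {
                block = loader.apply(offset);
                loads.incrementAndGet();
                cache.put(offset, block);
            }
            return block;
        }
    }

    int loadCount() {
        return loads.get();
    }
}
```

In this sketch a cache hit never touches the lock map, while on a miss every other reader of the same offset still waits for the single load to finish, which is the queuing described above.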

> Consider double-checked locking for block cache lock
> ----------------------------------------------------
>                 Key: HBASE-5898
>                 URL: https://issues.apache.org/jira/browse/HBASE-5898
>             Project: HBase
>          Issue Type: Improvement
>          Components: Performance
>    Affects Versions: 0.94.1
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.94.3, 0.96.0
>         Attachments: 5898-TestBlocksRead.txt, HBASE-5898-0.patch, HBASE-5898-1.patch,
> Running a workload with a high query rate against a dataset that fits in cache, I saw
> a lot of CPU being used in IdLock.getLockEntry, being called by HFileReaderV2.readBlock. Even
> though it was all cache hits, it was wasting a lot of CPU doing lock management here. I wrote
> a quick patch to switch to double-checked locking and it improved throughput substantially
> for this workload.

