hbase-issues mailing list archives

From "Hiroshi Ikeda (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-14708) Use copy on write Map for region location cache
Date Tue, 17 Nov 2015 11:34:11 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15008517#comment-15008517 ]

Hiroshi Ikeda commented on HBASE-14708:
---------------------------------------

Certainly it is not trivial to devise a rule that keeps the counter, the internal concurrent
map, and its array cache consistent using only CAS. That has been done by encapsulating
the details in a few methods, and it is clear how to implement. It is also convenient that
you can use the internal concurrent map for read operations at any time.

Accessing the counter from multiple threads causes overhead, which is observable in the benchmark
result, but it is still 7.5 times faster than the copy-on-write array implementation. As for read
operations, once a read cache is created there is no difference except the overhead of calling
a method (which can be optimized by the VM, I think) and an additional null check.

> Use copy on write Map for region location cache
> -----------------------------------------------
>
>                 Key: HBASE-14708
>                 URL: https://issues.apache.org/jira/browse/HBASE-14708
>             Project: HBase
>          Issue Type: Improvement
>          Components: Client
>    Affects Versions: 1.1.2
>            Reporter: Elliott Clark
>            Assignee: Elliott Clark
>            Priority: Critical
>             Fix For: 2.0.0, 1.2.0, 1.3.0
>
>         Attachments: HBASE-14708-v10.patch, HBASE-14708-v11.patch, HBASE-14708-v12.patch,
HBASE-14708-v13.patch, HBASE-14708-v15.patch, HBASE-14708-v16.patch, HBASE-14708-v17.patch,
HBASE-14708-v2.patch, HBASE-14708-v3.patch, HBASE-14708-v4.patch, HBASE-14708-v5.patch, HBASE-14708-v6.patch,
HBASE-14708-v7.patch, HBASE-14708-v8.patch, HBASE-14708-v9.patch, HBASE-14708.patch, anotherbench.zip,
anotherbench2.zip, location_cache_times.pdf, result.csv
>
>
> Internally a co-worker profiled their application that was talking to HBase. More than 60%
of the time was spent in locating a region. This was while the cluster was stable and no regions
were moving.
> To figure out if there was a faster way to cache region location I wrote up a benchmark
here: https://github.com/elliottneilclark/benchmark-hbase-cache
> This tries to simulate a heavy load on the location cache. 
> * 24 different threads.
> * 2 Deleting location data
> * 2 Adding location data
> * Using floor to get the result.
> To repeat my work just run ./run.sh and it should produce a result.csv
> Results:
> ConcurrentSkipListMap is a good middle ground. It's got equal speed for reading and writing.
> However most operations will not need to remove or add a region location. There will
be potentially several orders of magnitude more reads for cached locations than there will
be on clearing the cache.
> So I propose a copy on write tree map.
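
For illustration only, here is a minimal sketch of what a copy-on-write tree map with floor lookups might look like. It is not the code from any of the attached patches: the class name CopyOnWriteSortedCache and its methods are hypothetical, and the keys stand in for region start keys. It only assumes the idea stated above, namely that reads vastly outnumber writes, so each write copies the whole map and publishes the copy while reads stay lock-free.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical illustration; not the class from the HBASE-14708 patches. */
final class CopyOnWriteSortedCache<K extends Comparable<K>, V> {
    // Readers only ever see a fully built map that is never mutated
    // after it has been published through the AtomicReference.
    private final AtomicReference<TreeMap<K, V>> current =
        new AtomicReference<>(new TreeMap<>());

    /** Lock-free read: value for the greatest key <= the given key, or null. */
    V floor(K key) {
        Map.Entry<K, V> e = current.get().floorEntry(key);
        return e == null ? null : e.getValue();
    }

    /** Write path: copy the whole map, modify the copy, publish it with CAS. */
    void put(K key, V value) {
        while (true) {
            TreeMap<K, V> old = current.get();
            TreeMap<K, V> copy = new TreeMap<>(old);
            copy.put(key, value);
            if (current.compareAndSet(old, copy)) {
                return;
            }
            // Another writer published first; retry against the newer map.
        }
    }

    /** Removal follows the same copy, modify, CAS pattern. */
    void remove(K key) {
        while (true) {
            TreeMap<K, V> old = current.get();
            if (!old.containsKey(key)) {
                return; // nothing to remove, so skip the copy
            }
            TreeMap<K, V> copy = new TreeMap<>(old);
            copy.remove(key);
            if (current.compareAndSet(old, copy)) {
                return;
            }
        }
    }

    int size() {
        return current.get().size();
    }
}
```

In this sketch every write costs O(n) for the copy, which is only acceptable because location updates are assumed to be rare compared with floor() lookups. A caller would insert an entry per region start key with put() and resolve a row key to its cached location with floor().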



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
