Date: Wed, 18 Nov 2015 00:54:11 +0000 (UTC)
From: "Elliott Clark (JIRA)" <jira@apache.org>
To: issues@hbase.apache.org
Message-ID: <JIRA.12908273.1445962650000.100117.1447808051057@Atlassian.JIRA>
In-Reply-To: <JIRA.12908273.1445962650000@Atlassian.JIRA>
References: <JIRA.12908273.1445962650000@Atlassian.JIRA>
 <JIRA.12908273.1445962650985@arcas>
Subject: [jira] [Commented] (HBASE-14708) Use copy on write Map for region
 location cache


    [ https://issues.apache.org/jira/browse/HBASE-14708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15009960#comment-15009960 ] 

Elliott Clark commented on HBASE-14708:
---------------------------------------

If we need to, we can always go with a sharded set of arrays. Each write would then copy only 1/N of the entries, at the cost of a slightly more complex array holder. However, I don't expect this to be pressing, since even people with very large tables aren't talking to 100k regions from one client.
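
To make the sharding idea concrete, here is a rough sketch of what a sharded set of copy-on-write sorted arrays could look like. The class name, the String keys, and sharding by hash code are all assumptions made for illustration; none of this is taken from the attached patches. A write copies only the shard that owns the key, while a floor lookup has to consult every shard, which is one form the extra complexity could take.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical sketch: not the array holder from the HBASE-14708 patches.
final class ShardedSortedKeys {
    // Each slot holds a sorted array that is never mutated, only replaced.
    private final AtomicReferenceArray<String[]> shards;

    ShardedSortedKeys(int shardCount) {
        shards = new AtomicReferenceArray<>(shardCount);
        for (int i = 0; i < shardCount; i++) {
            shards.set(i, new String[0]);
        }
    }

    // Assumption: shard by hash; sharding by key range is another option.
    private int shardOf(String key) {
        return Math.floorMod(key.hashCode(), shards.length());
    }

    // Write path: copies only the one shard that owns the key.
    synchronized void add(String key) {
        int s = shardOf(key);
        String[] old = shards.get(s);
        int pos = Arrays.binarySearch(old, key);
        if (pos >= 0) {
            return; // already present, nothing to copy
        }
        int at = -(pos + 1); // insertion point that keeps the array sorted
        String[] next = new String[old.length + 1];
        System.arraycopy(old, 0, next, 0, at);
        next[at] = key;
        System.arraycopy(old, at, next, at + 1, old.length - at);
        shards.set(s, next);
    }

    // Read path: takes no lock; returns the greatest key <= the argument, or null.
    String floor(String key) {
        String best = null;
        for (int s = 0; s < shards.length(); s++) {
            String[] a = shards.get(s);
            int pos = Arrays.binarySearch(a, key);
            int idx = pos >= 0 ? pos : -(pos + 1) - 1;
            if (idx >= 0 && (best == null || a[idx].compareTo(best) > 0)) {
                best = a[idx];
            }
        }
        return best;
    }
}
```

With N shards, adding one key copies roughly 1/N of the stored keys instead of all of them. Note that a lookup in this sketch reads the shards one after another, so it is not a single atomic snapshot across shards.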

> Use copy on write Map for region location cache
> -----------------------------------------------
>
>                 Key: HBASE-14708
>                 URL: https://issues.apache.org/jira/browse/HBASE-14708
>             Project: HBase
>          Issue Type: Improvement
>          Components: Client
>    Affects Versions: 1.1.2
>            Reporter: Elliott Clark
>            Assignee: Elliott Clark
>            Priority: Critical
>             Fix For: 2.0.0, 1.2.0, 1.3.0
>
>         Attachments: HBASE-14708-v10.patch, HBASE-14708-v11.patch, HBASE-14708-v12.patch, HBASE-14708-v13.patch, HBASE-14708-v15.patch, HBASE-14708-v16.patch, HBASE-14708-v17.patch, HBASE-14708-v2.patch, HBASE-14708-v3.patch, HBASE-14708-v4.patch, HBASE-14708-v5.patch, HBASE-14708-v6.patch, HBASE-14708-v7.patch, HBASE-14708-v8.patch, HBASE-14708-v9.patch, HBASE-14708.patch, anotherbench.zip, anotherbench2.zip, location_cache_times.pdf, result.csv
>
>
> Internally, a co-worker profiled their application that was talking to HBase. More than 60% of the time was spent locating a region. This was while the cluster was stable and no regions were moving.
> To figure out if there was a faster way to cache region location I wrote up a benchmark here: https://github.com/elliottneilclark/benchmark-hbase-cache
> This tries to simulate a heavy load on the location cache. 
> * 24 different threads.
> * 2 Deleting location data
> * 2 Adding location data
> * Using floor to get the result.
> To repeat my work just run ./run.sh and it should produce a result.csv
> Results:
> ConcurrentSkipListMap is a good middle ground. It's got equal speed for reading and writing.
> However, most operations will not need to remove or add a region location. There will potentially be several orders of magnitude more reads of cached locations than operations that clear the cache.
> So I propose a copy on write tree map.
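
For readers unfamiliar with the pattern, the following is a minimal sketch of a copy-on-write sorted map, assuming String row keys and String server names for brevity. The class and method names are hypothetical and are not the ones used in the attached patches. Reads go straight to an immutable snapshot without locking; every write copies the whole map and publishes the copy, which is acceptable only because reads are expected to vastly outnumber writes.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical illustration: not the class from the HBASE-14708 patches.
final class CopyOnWriteLocationCache {
    // Readers only ever see a fully built snapshot that is never mutated.
    private volatile NavigableMap<String, String> snapshot = new TreeMap<>();

    // Read path: no lock. Finds the entry with the greatest start key <= row.
    String locate(String row) {
        Map.Entry<String, String> entry = snapshot.floorEntry(row);
        return entry == null ? null : entry.getValue();
    }

    // Write path: copy the current map, change the copy, then publish it.
    synchronized void put(String startKey, String server) {
        NavigableMap<String, String> copy = new TreeMap<>(snapshot);
        copy.put(startKey, server);
        snapshot = copy;
    }

    synchronized void remove(String startKey) {
        NavigableMap<String, String> copy = new TreeMap<>(snapshot);
        copy.remove(startKey);
        snapshot = copy;
    }
}
```

Each write costs O(n) for the copy, while each read costs only the O(log n) floor lookup with no contention between readers.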

