phoenix-dev mailing list archives

From "Josh Elser (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-2940) Remove STATS RPCs from rowlock
Date Mon, 20 Jun 2016 04:42:05 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-2940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15338996#comment-15338996 ]

Josh Elser commented on PHOENIX-2940:
-------------------------------------

{quote}
I'd do ImmutableBytesPtr since this is the key the underlying cache is using and you can easily
access this from PTable.getName().getBytesPtr(). Otherwise you end up creating a new ImmutableBytesPtr
with every call to get. There's only a few callers of ConnectionQueryService.getTableStats(),
so I'm hoping it's not too bad.
{quote}

Yeah, I came to that one too. Glad we're in agreement :)
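
For anyone following along, here's a rough sketch of why keying on {{ImmutableBytesPtr}} matters; the cache shape and value type below are just illustrative, not the actual Phoenix internals:

{code:java}
import java.util.concurrent.ConcurrentHashMap;

import org.apache.phoenix.hbase.index.util.ImmutableBytesPtr;
import org.apache.phoenix.schema.PTable;

// Illustrative stand-in for the stats cache; the value type is a placeholder.
class StatsCacheSketch {
    private final ConcurrentHashMap<ImmutableBytesPtr, Object> cache =
            new ConcurrentHashMap<>();

    // A byte[] key can't be used directly (arrays have identity
    // equals/hashCode), so a byte[]-based API must allocate a fresh
    // ImmutableBytesPtr wrapper on every single lookup.
    Object getStats(byte[] tableName) {
        return cache.get(new ImmutableBytesPtr(tableName));
    }

    // Keying on the pointer the PTable already holds avoids that allocation:
    // PTable.getName().getBytesPtr() hands back an existing ImmutableBytesPtr.
    Object getStats(PTable table) {
        return cache.get(table.getName().getBytesPtr());
    }
}
{code}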

bq. I knew I was asking the right person about what would happen if we updated the protobuf

:)

--

The last thing I wanted to do was some performance testing. Amazingly, I actually got it done
(at least to a degree I'm mostly happy about). I used a combination of [~ndimiduk]'s TPC-DS setup (which came from [~cartershanklin], originally) and Apache JMeter for the concurrent read-side testing, and Pherf for some write testing.

I had a 5-RS HBase instance (on some "crappy" VMs: 2 cores, 16G RAM, 1 effective disk), so the
numbers are a bit low (but the difference between them is more what we care about). I generated
about 30G of data with TPC-DS and used JMeter to run a bunch of point queries (5 JMeter clients,
8 threads per client, 1500 queries per thread). The point queries were generated using bounded,
random values, so there was a decent ratio of hits to misses. For these, I did 3 runs with
master and 3 runs with this patch. Looking at p90, p95, p99, and median latencies across
4 different queries, there was not a significant difference in the execution of the queries.
If anything, the 2940 patch might have been slightly faster on average than master (which
makes sense because we should be reading the stats table less often and sending less data
over the wire, but given the data size, the difference doesn't show up as significant).
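
For reference, each JMeter thread's workload looked roughly like the sketch below; the JDBC URL, table, key bounds, and percentile math here are stand-ins, not the exact JMeter setup:

{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

public class PointQueryRunner {
    public static void main(String[] args) throws Exception {
        List<Long> latenciesMs = new ArrayList<>();
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-host");
             PreparedStatement ps = conn.prepareStatement(
                 "SELECT * FROM STORE_SALES WHERE SS_ITEM_SK = ? AND SS_TICKET_NUMBER = ?")) {
            // 1500 queries per thread; bounded random keys deliberately give a
            // mix of hits and misses (the bounds here are made up).
            for (int i = 0; i < 1500; i++) {
                ps.setLong(1, ThreadLocalRandom.current().nextLong(1, 200_000));
                ps.setLong(2, ThreadLocalRandom.current().nextLong(1, 2_000_000));
                long start = System.nanoTime();
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) { /* drain results */ }
                }
                latenciesMs.add((System.nanoTime() - start) / 1_000_000);
            }
        }
        // Report median and tail latencies, as in the runs described above.
        Collections.sort(latenciesMs);
        for (double p : new double[] {0.50, 0.90, 0.95, 0.99}) {
            int idx = (int) Math.ceil(p * latenciesMs.size()) - 1;
            System.out.printf("p%.0f = %d ms%n", p * 100, latenciesMs.get(idx));
        }
    }
}
{code}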

I also ran one aggregate style query between the store_sales table and the date dimension
table. The code from master was a little faster here, but I believe this may have been because
I didn't re-compact the table after switching from the code in master to the code from 2940
(the restart screwed up locality for some reason and I had to run the balancer to redistribute
the regions). In short, I did not observe a significant difference in concurrent reads with
this patch.
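
(For anyone reproducing this: after the balancer moves regions around, a major compaction rewrites each region's files on its current host and restores locality. A minimal sketch against the stock HBase 1.x Admin API, with a made-up table name:)

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class RestoreLocality {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Admin admin = conn.getAdmin()) {
            // Asynchronously requests a major compaction of every region.
            admin.majorCompact(TableName.valueOf("STORE_SALES"));
        }
    }
}
{code}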

I captured most (hopefully all) of my automation in https://github.com/joshelser/phoenix-performance

On the write side, I used Pherf to get some concurrent writers into HBase. Across the 5 nodes,
I ingested pseudo-random data into a 10-column table with 5 salt buckets to split up the load
as much as possible. Each Pherf client wrote 5M records and all Pherf clients ran at the same
time. The scenario included a validation of the ingest using a simple {{select count(..)}}
on the primary key for the table. I performed 2 runs of this on both master and the 2940 patch
(currently finishing up run 2 on master, but I don't expect a difference). Performance appears
to be pretty equivalent across both master and the 2940 patch.
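
The shape of that scenario in plain JDBC is roughly the sketch below; the table name, schema, and commit batching are illustrative (the real runs drove this through Pherf scenario files):

{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.UUID;

public class WriteScenarioSketch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-host")) {
            // SALT_BUCKETS=5 prepends a one-byte hash to the row key,
            // pre-splitting the table into 5 regions to spread the write load.
            conn.createStatement().execute(
                "CREATE TABLE IF NOT EXISTS WRITE_TEST ("
                + "PK VARCHAR PRIMARY KEY, C1 VARCHAR, C2 VARCHAR, C3 VARCHAR, "
                + "C4 VARCHAR, C5 VARCHAR, C6 VARCHAR, C7 VARCHAR, C8 VARCHAR, "
                + "C9 VARCHAR) SALT_BUCKETS = 5");
            conn.setAutoCommit(false);
            try (PreparedStatement ps = conn.prepareStatement(
                    "UPSERT INTO WRITE_TEST VALUES (?,?,?,?,?,?,?,?,?,?)")) {
                for (long i = 0; i < 5_000_000L; i++) { // 5M rows per client
                    ps.setString(1, UUID.randomUUID().toString());
                    for (int c = 2; c <= 10; c++) {
                        ps.setString(c, "v" + i);
                    }
                    ps.executeUpdate();
                    if (i % 1000 == 0) {
                        conn.commit(); // flush mutations every 1000 rows
                    }
                }
                conn.commit();
            }
            // Validate the ingest with a simple count on the primary key.
            try (Statement st = conn.createStatement();
                 ResultSet rs = st.executeQuery("SELECT COUNT(PK) FROM WRITE_TEST")) {
                rs.next();
                System.out.println("rows = " + rs.getLong(1));
            }
        }
    }
}
{code}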

I do have the numbers here if anyone is curious about them, but, IMO, the lack of a significant
difference between master and this patch is what I wanted to confirm, to be more certain that
we aren't introducing any dumb performance regressions. I will double-check the last comments
from James again with fresh eyes and then try to commit this tomorrow morning (FYI for 4.8
[~ankit.singhal]).

> Remove STATS RPCs from rowlock
> ------------------------------
>
>                 Key: PHOENIX-2940
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2940
>             Project: Phoenix
>          Issue Type: Improvement
>         Environment: HDP 2.3 + Apache Phoenix 4.6.0
>            Reporter: Nick Dimiduk
>            Assignee: Josh Elser
>             Fix For: 4.8.0
>
>         Attachments: PHOENIX-2940.001.patch, PHOENIX-2940.002.patch, PHOENIX-2940.003.patch,
PHOENIX-2940.004.patch
>
>
> We have an unfortunate situation wherein we potentially execute many RPCs while holding
a row lock. This problem is discussed in detail on the user list thread ["Write path blocked
by MetaDataEndpoint acquiring region lock"|http://search-hadoop.com/m/9UY0h2qRaBt6Tnaz1&subj=Write+path+blocked+by+MetaDataEndpoint+acquiring+region+lock].
During some situations, the [MetaDataEndpoint|https://github.com/apache/phoenix/blob/10909ae502095bac775d98e6d92288c5cad9b9a6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L492]
coprocessor will attempt to refresh its view of the schema definitions and statistics. This
involves [taking a rowlock|https://github.com/apache/phoenix/blob/10909ae502095bac775d98e6d92288c5cad9b9a6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L2862],
executing a scan against the [local region|https://github.com/apache/phoenix/blob/10909ae502095bac775d98e6d92288c5cad9b9a6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L542],
and then a scan against a [potentially remote|https://github.com/apache/phoenix/blob/10909ae502095bac775d98e6d92288c5cad9b9a6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L964]
statistics table.
> This issue is apparently exacerbated by the use of user-provided timestamps (in my case,
the use of the ROW_TIMESTAMP feature, or perhaps as in PHOENIX-2607). When combined with other
issues (PHOENIX-2939), we end up with total gridlock in our handler threads -- everyone queued
behind the rowlock, scanning and rescanning SYSTEM.STATS. Because this happens in the MetaDataEndpoint,
the means by which all clients refresh their knowledge of schema, gridlock in that RS can
effectively stop all forward progress on the cluster.
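
To make the failure mode concrete, the pattern being described is roughly the following; this is a hand-written illustration against the HBase 1.x coprocessor/client APIs, not the actual MetaDataEndpointImpl code:

{code:java}
import java.io.IOException;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.regionserver.Region;
import org.apache.hadoop.hbase.regionserver.RegionScanner;

class RowLockAntiPattern {
    // Rebuilds schema/stats state while holding an exclusive row lock.
    void buildTableUnderLock(Region catalogRegion, Table statsTable, byte[] key)
            throws IOException {
        Region.RowLock lock = catalogRegion.getRowLock(key, false); // write lock
        try {
            // Scan the local SYSTEM.CATALOG region for the schema definition.
            try (RegionScanner local = catalogRegion.getScanner(new Scan())) {
                // ... rebuild the PTable from catalog rows ...
            }
            // Scan SYSTEM.STATS, which may live on a DIFFERENT region server:
            // every handler queued on this row lock now waits on a remote RPC.
            try (ResultScanner remote = statsTable.getScanner(new Scan())) {
                // ... read guideposts ...
            }
        } finally {
            lock.release();
        }
    }
}
{code}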



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
