hbase-issues mailing list archives

From "Srikanth Srungarapu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-13420) RegionEnvironment.offerExecutionLatency Blocks Threads under Heavy Load
Date Wed, 08 Apr 2015 03:50:12 GMT

    [ https://issues.apache.org/jira/browse/HBASE-13420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14484664#comment-14484664 ]

Srikanth Srungarapu commented on HBASE-13420:

I think this metric is way too broad to be coherent. Is it the latency on a postRegionOperation
call or a prePut on the observer?

The definition of the metric would be: The first N (100) latencies from any possible coprocessor
call for a specific Region Observer refreshed every 45 seconds. Still working on a clever
The main intent behind this change is to provide useful information (time taken by all the
pre and post hooks) on a per-coprocessor basis. Please take a look [here|http://hbase.apache.org/book.html#_monitor_time_spent_in_coprocessors]
for more details.

Would it make sense to build an actual bean for each of the observers that actually reports
real metrics and is registered in jmx following the signature of the observer? 
This change is geared more towards operations folks who want a quick and dirty check of whether
any anomalies have been introduced by coprocessor modules. Did you notice this perf impact under
a typical average workload or while doing some sort of stress testing? I liked Andrew's patch
as it takes the middle path. What do you think of it?

We clearly need a short term fix, but I am concerned we are continuing a metric that really
serves no purpose.
As Andrew already stated, we clearly can't pull this from 0.98 or 1.0. So, I'm thinking we
can add a conf parameter that defaults to true. In your case, you might want to turn it off.
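
To make the suggestion concrete, here is a minimal sketch of gating the latency sampling behind a boolean setting that defaults to true. It is not the attached patch. The class name, the property key, and the use of java.util.Properties in place of the real HBase configuration object are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Sketch only: latency sampling that can be switched off by configuration. */
final class LatencyTracker {
    // Hypothetical key name invented for this example; not an actual HBase property.
    static final String ENABLED_KEY = "example.coprocessor.latency.tracking.enabled";

    private final boolean enabled;
    // Same shape as the buffer in the issue description: bounded at 100 samples.
    private final BlockingQueue<Long> samples = new ArrayBlockingQueue<>(100);

    LatencyTracker(Properties conf) {
        // Defaults to true so that existing deployments keep the current behavior.
        this.enabled = Boolean.parseBoolean(conf.getProperty(ENABLED_KEY, "true"));
    }

    void offerExecutionLatency(long latencyNanos) {
        if (!enabled) {
            return; // Never touches the queue when tracking is turned off.
        }
        samples.offer(latencyNanos); // Does not wait for space; drops the sample if full.
    }

    /** Removes and returns everything sampled so far. */
    List<Long> drain() {
        List<Long> out = new ArrayList<>();
        samples.drainTo(out);
        return out;
    }
}
```

With the flag set to false, the hot path is reduced to a single boolean check, which is the point of letting affected users turn it off.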

> RegionEnvironment.offerExecutionLatency Blocks Threads under Heavy Load
> -----------------------------------------------------------------------
>                 Key: HBASE-13420
>                 URL: https://issues.apache.org/jira/browse/HBASE-13420
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: John Leach
>            Assignee: Andrew Purtell
>         Attachments: HBASE-13420.patch, HBASE-13420.txt, offerExecutionLatency.tiff
>   Original Estimate: 3h
>  Remaining Estimate: 3h
> The ArrayBlockingQueue blocks threads for 20s during a performance run focusing on creating
> numerous small scans.
> I see a buffer size of (100)
>     private final BlockingQueue<Long> coprocessorTimeNanos = new ArrayBlockingQueue<Long>(
> and then I see a drain coming from
>          MetricsRegionWrapperImpl with 45 second executor
>          HRegionMetricsWrapperRunable
>          RegionCoprocessorHost#getCoprocessorExecutionStatistics()   
>          RegionCoprocessorHost#getExecutionLatenciesNanos()
> Am I missing something?
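
The buffer-and-drain pattern described above can be sketched as follows. This is an illustration under assumptions, not the HBase code: the class and method names are hypothetical, the capacity of 100 is taken from the description, and the periodic 45 second drain is represented by a plain drain() call. In the OpenJDK implementation, ArrayBlockingQueue guards all operations with one lock, so even an offer() against a full queue has to acquire that lock before it is rejected.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Sketch only: a bounded sample buffer that is filled by callers and emptied periodically. */
final class LatencySampleBuffer {
    private final BlockingQueue<Long> coprocessorTimeNanos;

    LatencySampleBuffer(int capacity) {
        this.coprocessorTimeNanos = new ArrayBlockingQueue<>(capacity);
    }

    /** Returns true if the sample was stored, false if the buffer was already full. */
    boolean offer(long latencyNanos) {
        // offer() does not wait for free space, but it still takes the queue's lock.
        return coprocessorTimeNanos.offer(latencyNanos);
    }

    /** Stands in for the periodic metrics task: empties the buffer and returns the samples. */
    List<Long> drain() {
        List<Long> out = new ArrayList<>();
        coprocessorTimeNanos.drainTo(out);
        return out;
    }
}
```

Once the first 100 samples are in, every further offer() is rejected until the next drain, yet each of those calls still goes through the shared lock. With many handler threads issuing small scans, that lock is where the contention would show up.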

