hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-3694) high multiput latency due to checking global mem store size in a synchronized function
Date Fri, 25 Mar 2011 20:51:06 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011414#comment-13011414 ]

stack commented on HBASE-3694:
------------------------------

Patch looks good but I stumble when I come to this:

{code}
+  /**
+   * @return the global mem store size in the region server
+   */
+  public AtomicLong getGlobalMemstoreSize();
{code}

Here we are adding a getter for a single value to the RSS interface.  RSS is usually about
more macro-type services than a single data member's value, and a user of RSS would rarely be
interested in this one value.  More useful, I'd think, would be for RSS to return a class that
gives the client a (read-only) view on multiple RS values; e.g. there is talk above of a
MemoryAccountingManager, which I imagine would have this memstore size among other values.

We could change getRpcMetrics to be a generic getMetrics that returns a RegionServerMetrics
instance, which would include an instance of HBaseRpcMetrics and the current state of the above counter?
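
A minimal sketch of that idea, assuming RSS here means the RegionServerServices interface. The names RegionServerMetricsView and RegionServerServicesSketch are hypothetical stand-ins, not the actual HBase types, and the RPC metrics object is represented by a plain Object:

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical read-only view over several region server values.
// The real RegionServerMetrics / HBaseRpcMetrics classes are not used here.
final class RegionServerMetricsView {
    private final AtomicLong globalMemstoreSize; // live counter owned by the server
    private final Object rpcMetrics;             // stand-in for an HBaseRpcMetrics instance

    RegionServerMetricsView(AtomicLong globalMemstoreSize, Object rpcMetrics) {
        this.globalMemstoreSize = globalMemstoreSize;
        this.rpcMetrics = rpcMetrics;
    }

    // Exposes the current value as a long, so callers cannot mutate the counter.
    long getGlobalMemstoreSize() {
        return globalMemstoreSize.get();
    }

    Object getRpcMetrics() {
        return rpcMetrics;
    }
}

// Hypothetical slice of the services interface: one generic getMetrics()
// instead of one getter per value.
interface RegionServerServicesSketch {
    RegionServerMetricsView getMetrics();
}
```

Returning a long rather than the AtomicLong itself is what makes the view read-only in this sketch; whether the patch should go that far is a separate question.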





> high multiput latency due to checking global mem store size in a synchronized function
> --------------------------------------------------------------------------------------
>
>                 Key: HBASE-3694
>                 URL: https://issues.apache.org/jira/browse/HBASE-3694
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Liyin Tang
>            Assignee: Liyin Tang
>         Attachments: Hbase-3694[r1085306], Hbase-3694[r1085306]_2.patch, Hbase-3694[r1085306]_3.patch, Hbase-3694[r1085508]_4.patch
>
>
> The problem is that we found multiput latency to be very high.
> In our case, we have almost 22 regions in each RS, and no flushes happened during these puts.
> After investigation, we believe the root cause is the function getGlobalMemStoreSize, which checks the high water mark of the mem store.
> When we instrumented the code with some metrics, this function took almost 40% of the total execution time of multiput.
> The actual percentage may be even higher. The execution time is spent on synchronization contention.
> One solution is to keep a static variable in HRegion that holds the global MemStore size, instead of calculating it every time.
> Why use a static variable?
> Since all the HRegion objects in the same JVM share the same memory heap, they need to share fate as well.
> The static variable, globalMemStroeSize, naturally shows the total memory usage in this shared heap for this JVM.
> If multiple RS need to run in the same JVM, they still need only one globalMemStroeSize.
> If multiple RS run on different JVMs, everything is fine.
> After the change, in our case, the average multiput latency decreased from 60ms to 10ms.
> I will submit a patch based on the current trunk.
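
The approach in the description above can be sketched roughly as follows. This is an illustration under assumptions, not the attached patch: RegionSketch is a hypothetical stand-in for HRegion, and the method names are invented for the example. The point is that each put updates a JVM-wide AtomicLong, so the high water mark check becomes a single lock-free read instead of a synchronized recalculation.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for HRegion; not the real HBase class.
class RegionSketch {
    // One counter per JVM, shared by every region in the same heap.
    static final AtomicLong globalMemStoreSize = new AtomicLong(0);

    // Size of this region's own mem store.
    private final AtomicLong memStoreSize = new AtomicLong(0);

    // On each put: update the region-local and JVM-wide totals without locking.
    long addToMemStore(long delta) {
        memStoreSize.addAndGet(delta);
        return globalMemStoreSize.addAndGet(delta);
    }

    // After a flush: subtract the flushed bytes from both totals.
    long flushed(long bytes) {
        memStoreSize.addAndGet(-bytes);
        return globalMemStoreSize.addAndGet(-bytes);
    }

    // High water mark check: a single atomic read, no iteration over regions.
    static boolean aboveHighWaterMark(long limit) {
        return globalMemStoreSize.get() >= limit;
    }
}
```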

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
