hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-10656) high-scale-lib's Counter depends on Oracle (Sun) JRE, and also has some bug
Date Tue, 18 Oct 2016 21:43:58 GMT

    [ https://issues.apache.org/jira/browse/HBASE-10656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15586749#comment-15586749 ]

stack commented on HBASE-10656:

[~ikeda] Here is an interesting observation by a coworker [~misha@cloudera.com]. I can open
a new issue to discuss it, but I am posting it here for the moment:

To induce high load on MONITORING TOOL in my small 8-machine cluster, V suggested creating
10 HBase tables with 1K regions each, so that MONITORING TOOL gets 10K new entities
to monitor. I've done that, and it worked for MONITORING TOOL as expected. However, one thing
that we noticed is that the HBase Region Servers in my cluster are now constantly running GC...

I decided to take a quick look, took a heap dump from one of the region servers and analyzed
it with the same tool (http://www.jxray.com) that I use in the MONITORING TOOL work. The output
is attached.

One finding is that 41% of memory is occupied by instances of org.apache.hadoop.hbase.util.Counter$Cell
class, and they seem to be actively "churned" by GC all the time. I looked at the code of
this class, and one thing that immediately caught my eye is this:

  private static class Cell {
    // Pads are added around the value to avoid cache-line contention with
    // another cell's value. The cache-line size is expected to be equal to or
    // less than about 128 Bytes (= 64 Bits * 16).

    volatile long p0, p1, p2, p3, p4, p5, p6;
    volatile long value;
    volatile long q0, q1, q2, q3, q4, q5, q6;

So, as far as I understand, the only meaningful data field in this class, 'value', is deliberately
"padded" with empty fields just to make an instance of this class large enough to fill an entire
128-byte cache line.
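
As a rough back-of-the-envelope check of the padding described above, the sketch below estimates
the per-instance footprint of a class with 15 long fields versus one with a single long field.
The header sizes (12 or 16 bytes) and the 8-byte object alignment are assumptions about a typical
64-bit HotSpot JVM, not figures taken from the heap dump, and the class name CellFootprint is
hypothetical.

```java
public class CellFootprint {
    static final int LONG_BYTES = 8;

    // Round size up to the next multiple of alignment.
    static long alignUp(long size, long alignment) {
        return (size + alignment - 1) / alignment * alignment;
    }

    // Estimated shallow size of an object holding only long fields.
    // Assumption: 8-byte object alignment, as on a typical 64-bit HotSpot JVM.
    static long estimate(int headerBytes, int longFields) {
        return alignUp(headerBytes + (long) longFields * LONG_BYTES, 8);
    }

    public static void main(String[] args) {
        int paddedFields = 7 + 1 + 7; // p0..p6, value, q0..q6
        // Assumption: the object header is 12 bytes (compressed class pointers) or 16 bytes.
        for (int header : new int[] {12, 16}) {
            System.out.println("header=" + header
                + " padded=" + estimate(header, paddedFields)
                + " unpadded=" + estimate(header, 1));
        }
    }
}
```

Under these assumptions a padded cell comes to roughly 136 bytes, against roughly 24 bytes for an
object holding only 'value', i.e. between five and six times larger per instance.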

This looks like a very extreme optimization that would work if there were very few objects
in memory, or at least very few Counter$Cell instances, so that they were kept in the cache
all the time. But clearly, in our case, making these objects artificially large greatly increases
the GC pressure and ultimately makes everything much slower.

Can somebody shed some light on this? In particular:

- Why are so many Counter instances created and destroyed all the time, despite the fact
that there is no HBase activity going on?
- I don't think the setup with 10K regions is very unconventional. If so many Cell objects
need to be maintained, then it is probably worth providing, e.g., another implementation that is
simply optimized for size rather than for memory cache performance?
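
To illustrate what a size-optimized alternative might look like, here is a minimal sketch that
keeps a single unpadded value in a java.util.concurrent.atomic.AtomicLong. The class name
CompactCounter and its method names are hypothetical and are not taken from HBase's Counter; the
sketch trades throughput under heavy write contention for a much smaller footprint per counter.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical size-optimized counter: one AtomicLong, no padding, no cell array.
public class CompactCounter {
    private final AtomicLong value = new AtomicLong();

    // Atomically adds delta (which may be negative) to the counter.
    public void add(long delta) {
        value.addAndGet(delta);
    }

    public void increment() {
        add(1);
    }

    public void decrement() {
        add(-1);
    }

    // Returns the current value.
    public long get() {
        return value.get();
    }

    // Overwrites the current value.
    public void set(long newValue) {
        value.set(newValue);
    }

    public static void main(String[] args) throws InterruptedException {
        final CompactCounter counter = new CompactCounter();
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(new Runnable() {
                @Override
                public void run() {
                    for (int n = 0; n < 1000; n++) {
                        counter.increment();
                    }
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println(counter.get());
    }
}
```

Because every thread updates the same memory location, this variant can suffer under heavy
concurrent writes, which is the case the padded, striped design targets. For counters that are
mostly idle, as in the scenario described here, that cost may be acceptable.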

On Question #1, it is probably our metrics accounting that is going on. On #2, you might have

>  high-scale-lib's Counter depends on Oracle (Sun) JRE, and also has some bug
> ----------------------------------------------------------------------------
>                 Key: HBASE-10656
>                 URL: https://issues.apache.org/jira/browse/HBASE-10656
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Hiroshi Ikeda
>            Assignee: Hiroshi Ikeda
>            Priority: Minor
>             Fix For: 0.96.2, 0.98.1, 0.99.0
>         Attachments: 10656-098.v2.txt, 10656-trunk.v2.patch, 10656.096v2.txt, HBASE-10656-0.96.patch,
HBASE-10656-addition.patch, HBASE-10656-trunk.patch, MyCounter.java, MyCounter2.java, MyCounter3.java,
MyCounterTest.java, MyCounterTest.java, PerformanceTestApp.java, PerformanceTestApp2.java,
output.pdf, output.txt, output2.pdf, output2.txt
> Cliff's high-scale-lib's Counter is used in important classes (for example, HRegion)
in HBase, but Counter uses sun.misc.Unsafe, which is an implementation detail of the Java standard
library and belongs to Oracle (Sun). That consequently makes HBase depend on a specific
JRE implementation.
> To make matters worse, Counter has a bug, and you may get a wrong result if you mix a reading
method into logic that calls writing methods.
> In more detail, I think the bug is caused by reading an internal array field without
resolving memory caching, which the comment says is intentional, but then storing the read result
into a volatile field. That field might not be changed after you can see the true values of
the array field, and also might not be changed after updating the "next" CAT instance's values
in some race condition when extending the CAT instance chain.
> Anyway, it is possible to create a new alternative class that depends only on
the standard library. I know Java 8 provides its own alternative, but HBase should support Java 6
and Java 7 for some time.
