hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hiroshi Ikeda (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-10656) high-scale-lib's Counter depends on Oracle (Sun) JRE, and also has some bug
Date Thu, 20 Mar 2014 08:32:46 GMT

     [ https://issues.apache.org/jira/browse/HBASE-10656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Hiroshi Ikeda updated HBASE-10656:

    Attachment: HBASE-10656-addition.patch

I'm a little worried about that the new Counter may consume too much memory. The Counter usually
expands its internal table to distinguish threads' IDs. That means, for example, using two
threads with ID=0 and 256 requires 512 length for the table, and each element uses about 16
long variables, and the table consequently consumes about 8192 long variable (= 64KB) for
only just one counter. This can become worse for using ID=0 and 512, or ID=0 and 1024, etc.

(However, I think high-scale-lib's Counter usually consume more memory than the new Counter,
because using random hash has more chances to collide than using sequential one.)

After some trials I has concluded that we have only 3 methods to give a different index (or
something) to a thread without synchronization,
(1) Thread.getId(),
(2) Object.hashCode() (or System.identityHashCode() for more safety), and
(3) ThreadLocal. 
Only using (3) can change the index, and we require it instead of extending the internal table.
Performance cost of using ThreadLocal is not trivial comparing to the main logic (just increment),
but we have no choice.

I attached files:
MyCounter3.java, which is an alternative Counter,
PerformanceTestApp2.java and its result files.
Also I attached the additional patch for the case that we decide to adopt it.

According to the result of the performance test, for increment, MyCounter3 requires more 30-40%
overhead than MyCounter2, which is also slower than high-scale-lib's Counter when using a
lot of threads. I could not recover the overhead of ThreadLocal after all.

(For getting a value, MyCounter3 is faster than MyCounter2 when using a lot of threads because
the internal table to traverse to sum values is smaller.)

I am not sure whether the patch should be applied or not.

By the way, Java8 is recently released and I tried and was surprised that, LongAdder is much
more faster than any counters, without unlimitedly consuming memory. I wanted to know the
trick, but I just found that it uses Unsafe to directly access Thread's newly added instance
variable to store thread-local information bypassing ThreadLocal. That's cheat! :(

>  high-scale-lib's Counter depends on Oracle (Sun) JRE, and also has some bug
> ----------------------------------------------------------------------------
>                 Key: HBASE-10656
>                 URL: https://issues.apache.org/jira/browse/HBASE-10656
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Hiroshi Ikeda
>            Assignee: Hiroshi Ikeda
>            Priority: Minor
>             Fix For: 0.96.2, 0.98.1, 0.99.0
>         Attachments: 10656-098.v2.txt, 10656-trunk.v2.patch, 10656.096v2.txt, HBASE-10656-0.96.patch,
HBASE-10656-addition.patch, HBASE-10656-trunk.patch, MyCounter.java, MyCounter2.java, MyCounter3.java,
MyCounterTest.java, MyCounterTest.java, PerformanceTestApp.java, PerformanceTestApp2.java,
output.pdf, output.txt, output2.pdf, output2.txt
> Cliff's high-scale-lib's Counter is used in important classes (for example, HRegion)
in HBase, but Counter uses sun.misc.Unsafe, that is implementation detail of the Java standard
library and belongs to Oracle (Sun). That consequently makes HBase depend on the specific
JRE Implementation.
> To make matters worse, Counter has a bug and you may get wrong result if you mix a reading
method into your logic calling writing methods.
> In more detail, I think the bug is caused by reading an internal array field without
resolving memory caching, which is intentional the comment says, but storing the read result
into a volatile field. That field may be not changed after you can see the true values of
the array field, and also may be not changed after updating the "next" CAT instance's values
in some race condition when extending CAT instance chain.
> Anyway, it is possible that you create a new alternative class which only depends on
the standard library. I know Java8 provides its alternative, but HBase should support Java6
and Java7 for some time.

This message was sent by Atlassian JIRA

View raw message