hbase-issues mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HBASE-10191) Move large arena storage off heap
Date Tue, 17 Dec 2013 23:28:08 GMT
Andrew Purtell created HBASE-10191:

             Summary: Move large arena storage off heap
                 Key: HBASE-10191
                 URL: https://issues.apache.org/jira/browse/HBASE-10191
             Project: HBase
          Issue Type: Umbrella
            Reporter: Andrew Purtell

Umbrella issue for moving large arena storage off heap.

Even with the improved G1 GC in Java 7, Java processes that want to address large regions
of memory while also keeping high-percentile latencies low continue to be challenged. Fundamentally,
a Java server process that has high data throughput and also tight latency SLAs will be stymied
by the fact that the JVM does not provide a fully concurrent collector. There is simply not
enough throughput to copy data during GC under safepoint (all application threads suspended)
within the available time bounds. This is increasingly an issue for HBase users operating under
dual pressures: 1. tight response SLAs, 2. the increasing amount of RAM available in "commodity"
server configurations.

We can address this using parallel strategies. We should talk with the Java platform developer
community about the possibility of a fully concurrent collector appearing in OpenJDK. Setting
aside the question of whether this would be too little, too late: if one becomes available, the
benefit will be immediate (subject to qualification for production) and transparent in terms
of code changes.

In the meantime, however, we need an answer for Java versions already in production. This
requires that we move the large arena allocations, the blockcache and the memstore, off heap.
Other JIRAs have recently discussed combining the blockcache and memstore (HBASE-9399) and
flushing the memstore into the blockcache (HBASE-5311), which is related work. We should build
off heap allocation for memstore and blockcache, perhaps as a unified pool for both, and plumb
zero copy direct access to these allocations (via direct buffers) through the read and write
I/O paths.

This may require constructing classes that provide object views over data contained within
direct buffers. This is something else we could talk with the Java platform developer community
about: it could be possible to provide language level object views over off heap memory, where
on heap objects could hold references to objects backed by off heap memory but not vice versa,
maybe facilitated by new intrinsics in Unsafe. Again, we also need an answer for today. We
should investigate what existing libraries may be available in this regard.

The key will be avoiding marshalling/unmarshalling costs. At most we should be copying primitives
out of the direct buffers to register or stack locations until finally copying data to construct
protobuf Messages. A related issue is HBASE-9794, which proposes scatter-gather access to
KeyValues when constructing RPC messages. We should see how far we can get with that and also
with zero copy construction of protobuf Messages backed by direct buffer allocations. Some
amount of native code may be required.
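
To make the "object views over data contained within direct buffers" idea concrete, here is a
minimal sketch and not a proposed HBase design. The class names (OffHeapArena, CellView), the
record layout ([keyLen:int][valueLen:int][key][value]) and the bump-pointer allocation scheme
are all assumptions made for illustration. Only standard java.nio calls are used. The view
copies primitives out of the direct buffer on access and hands out the value as a read-only
slice that shares the off heap memory instead of copying it.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical bump-pointer arena backed by one direct (off heap) buffer. */
final class OffHeapArena {
    private final ByteBuffer arena;
    private int next = 0; // offset of the next free byte

    OffHeapArena(int capacityBytes) {
        this.arena = ByteBuffer.allocateDirect(capacityBytes);
    }

    /** Appends [keyLen:int][valueLen:int][key][value]; returns the record offset. */
    synchronized int append(byte[] key, byte[] value) {
        int size = 8 + key.length + value.length;
        if (size > arena.capacity() - next) {
            throw new IllegalStateException("arena full");
        }
        int offset = next;
        arena.putInt(offset, key.length);       // absolute puts leave position untouched
        arena.putInt(offset + 4, value.length);
        ByteBuffer w = arena.duplicate();       // shares memory, independent position
        w.position(offset + 8);
        w.put(key).put(value);
        next += size;
        return offset;
    }

    /** Returns a small on heap view object; the record bytes stay off heap. */
    CellView viewAt(int offset) {
        return new CellView(arena, offset);
    }
}

/** Hypothetical flyweight view over one record inside the arena. */
final class CellView {
    private final ByteBuffer buf;
    private final int offset;

    CellView(ByteBuffer buf, int offset) {
        this.buf = buf;
        this.offset = offset;
    }

    int keyLength()   { return buf.getInt(offset); }
    int valueLength() { return buf.getInt(offset + 4); }

    /** Copies a single primitive out of the direct buffer. */
    byte keyByte(int i) { return buf.get(offset + 8 + i); }

    /** Read-only slice over the value bytes; no bytes are copied. */
    ByteBuffer value() {
        ByteBuffer d = buf.asReadOnlyBuffer();
        d.position(offset + 8 + keyLength());
        d.limit(d.position() + valueLength());
        return d.slice();
    }
}

final class OffHeapDemo {
    public static void main(String[] args) {
        OffHeapArena arena = new OffHeapArena(1024);
        int off = arena.append("row1".getBytes(StandardCharsets.UTF_8),
                               "v1".getBytes(StandardCharsets.UTF_8));
        CellView cell = arena.viewAt(off);
        System.out.println(cell.keyLength() + " " + cell.value().remaining());
    }
}
```

This sketch ignores freeing and reuse of space, and readers are not synchronized with the
writer, so a real allocator would need reclamation and a safe publication scheme on top of it.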

This message was sent by Atlassian JIRA
