Date: Wed, 18 Dec 2013 19:09:08 +0000 (UTC)
From: "Andrew Purtell (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-10191) Move large arena storage off heap

    [ https://issues.apache.org/jira/browse/HBASE-10191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13852031#comment-13852031 ]

Andrew Purtell commented on HBASE-10191:
----------------------------------------

bq. What HBase version are you using? No bucket cache yet?

Trunk, what is now 0.98.

As you point out above, serialization/deserialization costs limit the bucket cache, which is why I propose the goal of direct operation on allocations backed by off-heap memory. This has to be approached in stages.

The bucket cache encourages looking at this approach. Although you'll see reduced throughput, it will smooth out the latency tail and allow the blockcache to address RAM without increasing heap size, which also helps smooth out the latency tail with respect to collection pause distribution. However, with large heaps, e.g. 128+ GB, mixed-generation collections exceeding the ZooKeeper heartbeat timeout are inevitable under mixed read+write load; nothing I have found mitigates that sufficiently.
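To make the serialization point concrete: the cost mentioned above is, roughly, the copy that brings a cached block back on heap before it can be read, whereas "direct operation on allocations backed by off-heap memory" means reading the bytes where they sit. A minimal sketch of the two access patterns, assuming nothing about HBase internals (the class and method names below are illustrative, not actual HBase APIs):

{code:java}
import java.nio.ByteBuffer;

public class OffHeapReadSketch {

    // Copy-out style: bring the off-heap block on heap, then parse it there.
    // The allocation and bulk copy are the (de)serialization cost being discussed.
    static byte[] copyOnHeapThenRead(ByteBuffer offHeapBlock) {
        byte[] onHeap = new byte[offHeapBlock.remaining()];
        offHeapBlock.duplicate().get(onHeap);   // off heap -> heap copy
        return onHeap;                          // handed to existing on-heap parsing code
    }

    // Direct style: operate on the off-heap bytes in place.
    // Only a primitive crosses onto the stack; no intermediate byte[] is created.
    static long readFieldInPlace(ByteBuffer offHeapBlock, int fieldOffset) {
        return offHeapBlock.getLong(fieldOffset);
    }

    public static void main(String[] args) {
        ByteBuffer block = ByteBuffer.allocateDirect(4096);  // stands in for a cached block
        block.putLong(0, 42L);
        System.out.println(readFieldInPlace(block, 0));      // 42, with no on-heap copy
    }
}
{code}

Avoiding the first path's allocation and bulk copy is where the throughput and GC benefit of operating directly on off-heap memory would come from.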
> Move large arena storage off heap
> ---------------------------------
>
>                 Key: HBASE-10191
>                 URL: https://issues.apache.org/jira/browse/HBASE-10191
>             Project: HBase
>          Issue Type: Umbrella
>            Reporter: Andrew Purtell
>
> Umbrella issue for moving large arena storage off heap.
>
> Even with the improved G1 GC in Java 7, Java processes that want to address large regions of memory while also providing low high-percentile latencies continue to be challenged. Fundamentally, a Java server process that has high data throughput and also tight latency SLAs will be stymied by the fact that the JVM does not provide a fully concurrent collector. There is simply not enough throughput to copy data during GC under safepoint (all application threads suspended) within available time bounds. This is increasingly an issue for HBase users operating under dual pressures: 1. tight response SLAs, 2. the increasing amount of RAM available in "commodity" server configurations, because GC load is roughly proportional to heap size.
>
> We can address this using parallel strategies.
>
> We should talk with the Java platform developer community about the possibility of a fully concurrent collector appearing in OpenJDK somehow. Set aside the question of whether this is too little too late: if one becomes available, the benefit will be immediate, though subject to qualification for production, and transparent in terms of code changes. However, in the meantime we need an answer for Java versions already in production.
>
> This requires that we move the large arena allocations off heap, those being the blockcache and memstore. On other JIRAs recently there has been related discussion about combining the blockcache and memstore (HBASE-9399) and about flushing memstore into blockcache (HBASE-5311), which is related work. We should build off-heap allocation for memstore and blockcache, perhaps a unified pool for both, and plumb zero-copy direct access to these allocations (via direct buffers) through the read and write I/O paths.
>
> This may require the construction of classes that provide object views over data contained within direct buffers. This is something else we could talk with the Java platform developer community about: it could be possible to provide language-level object views over off-heap memory, where on-heap objects could hold references to objects backed by off-heap memory but not vice versa, maybe facilitated by new intrinsics in Unsafe. Again, we need an answer for today as well, so we should investigate what existing libraries may be available in this regard.
>
> Key will be avoiding marshalling/unmarshalling costs. At most we should be copying primitives out of the direct buffers to register or stack locations until finally copying data to construct protobuf Messages. A related issue there is HBASE-9794, which proposes scatter-gather access to KeyValues when constructing RPC messages. We should see how far we can get with that, and also with zero-copy construction of protobuf Messages backed by direct buffer allocations. Some amount of native code may be required.
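The "object views over data contained within direct buffers" mentioned in the description are essentially flyweights: a small on-heap handle holding a buffer reference plus an offset, whose accessors copy only primitives to the stack. A minimal sketch under that reading; the record layout, class, and method names here are hypothetical, not taken from HBase:

{code:java}
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical layout of one record in the buffer:
// [int rowLen][row bytes][long timestamp][int valueLen][value bytes]
final class CellView {
    private final ByteBuffer buf;   // typically a direct (off-heap) buffer
    private final int offset;       // start of this record within the buffer

    CellView(ByteBuffer buf, int offset) {
        this.buf = buf;
        this.offset = offset;
    }

    // Accessors move only primitives out of the buffer, onto the stack.
    int rowLength()   { return buf.getInt(offset); }
    long timestamp()  { return buf.getLong(offset + 4 + rowLength()); }
    int valueLength() { return buf.getInt(offset + 4 + rowLength() + 8); }

    // The only heap copy, deferred to the edge of the read path
    // (e.g. when the bytes finally go into an RPC message).
    byte[] copyValue() {
        byte[] v = new byte[valueLength()];
        ByteBuffer dup = buf.duplicate();
        dup.position(offset + 4 + rowLength() + 8 + 4);
        dup.get(v);
        return v;
    }

    static void writeSample(ByteBuffer buf, byte[] row, long ts, byte[] value) {
        buf.putInt(row.length).put(row).putLong(ts).putInt(value.length).put(value);
    }
}

class CellViewDemo {
    public static void main(String[] args) {
        ByteBuffer offHeap = ByteBuffer.allocateDirect(256);
        CellView.writeSample(offHeap, "row-1".getBytes(StandardCharsets.UTF_8),
                1387393748000L, "value-1".getBytes(StandardCharsets.UTF_8));
        CellView cell = new CellView(offHeap, 0);
        System.out.println(cell.timestamp());   // 1387393748000
        System.out.println(new String(cell.copyValue(), StandardCharsets.UTF_8));  // value-1
    }
}
{code}

Nothing larger than a primitive leaves the buffer until copyValue() is called, which is the kind of single, late copy the description argues for when finally constructing protobuf Messages.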