From: Anoop Sam John
To: "user@hbase.apache.org"
Subject: RE: Custom preCompact RegionObserver crashes entire cluster on OOME: Heap Space
Date: Wed, 13 Feb 2013 05:29:19 +0000

Can you post the code of your new InternalScanner, in particular the next() method implementation? I would like to see how you are doing the KV change.

-Anoop-
________________________________________
From: Mesika, Asaf [asaf.mesika@gmail.com]
Sent: Tuesday, February 12, 2013 8:11 PM
To: user@hbase.apache.org
Subject: Re: Custom preCompact RegionObserver crashes entire cluster on OOME: Heap Space

I'm seeing a very strange behavior:
If I run a scan during a major compaction, I can see both the modified delta KeyValue (which contains the aggregated value, e.g. 9) and the other two delta columns that were used for this aggregated column (e.g. 3, 3), as if the scan is exposed to the KeyValues produced mid-scan.
Could it be related to the cache somehow?
I am modifying the KeyValue object received from the InternalScanner in preCompact (modifying its value).

On Feb 12, 2013, at 11:22 AM, Anoop Sam John wrote:

>> The question is: is it "legal" to change a KV I received from the InternalScanner before adding it to the Result, i.e. returning it from my own InternalScanner?
>
> You can change it as per your need, IMO.
>
> -Anoop-
>
> ________________________________________
> From: Mesika, Asaf [asaf.mesika@gmail.com]
> Sent: Tuesday, February 12, 2013 2:43 PM
> To: user@hbase.apache.org
> Subject: Re: Custom preCompact RegionObserver crashes entire cluster on OOME: Heap Space
>
> I am trying to reduce the number of KeyValues generated during preCompact, but I'm getting some weird behavior.
>
> Let me describe briefly what I am doing:
>
> We have a counters table with the following structure:
>
> RowKey = A combination of field values representing the group-by key.
> CF = Time span aggregate (Hour, Day, Month). Currently we have only Hour.
> CQ = Round-to-hour timestamp (long).
> Value = The count
>
> We collect raw data and update the counters table for the matching group-by key and hour.
> We tried using Increment, but discovered it is very slow.
> Instead, we've decided to update the counters upon compaction. We write the deltas into the same row key, but with a longer column qualifier: .
> is: Delta or Aggregate.
> Delta stands for a delta column qualifier we send from our client.
>
> In preCompact, I create an InternalScanner which aggregates the delta column qualifier values and generates a new KeyValue with type Aggregate:
>
> The problem with this implementation is that it consumes more memory.
>
> Now, I've tried avoiding the creation of the Aggregate-type KV by simply reusing the first delta column qualifier: simply changing its value in the KeyValue.
> But for some reason, after a couple of minor / major compactions, I see data loss when I count the values and compare them to the expected totals.
>
>
> The question is: is it "legal" to change a KV I received from the InternalScanner before adding it to the Result, i.e. returning it from my own InternalScanner?
>
>
> On Feb 12, 2013, at 8:44 AM, Anoop Sam John wrote:
>
>> Asaf,
>> You have created a wrapper around the original InternalScanner instance created by the compaction flow?
>>
>>> Where do the KV generated during the compaction process queue up before being written to the disk? Is this buffer configurable?
>>> When I wrote the Region Observer, my assumption was that the compaction process works in a streaming fashion; thus even if I decide to generate a KV per KV I see, it still shouldn't be a problem memory-wise.
>>
>> There is no queuing. Your assumption is correct. Each KV is written to the writer as and when it is produced (just like how a memstore flush does the HFile write). As Lars said, a look at your code can tell if something is going wrong. Do you have blooms being used?
>>
>> -Anoop-
>> ________________________________________
>> From: Mesika, Asaf [asaf.mesika@gmail.com]
>> Sent: Tuesday, February 12, 2013 11:16 AM
>> To: user@hbase.apache.org
>> Subject: Custom preCompact RegionObserver crashes entire cluster on OOME: Heap Space
>>
>> Hi,
>>
>> I wrote a RegionObserver which implements preCompact.
>> I activated it in pre-production, and then the entire cluster dropped dead: one RegionServer after another crashed with an OutOfMemoryError: Heap Space.
>>
>> My preCompact method generates a KeyValue for each set of column qualifiers it sees.
>> When I remove the coprocessor and restart the cluster, the cluster remains stable.
>> I have 8 RS, each with a 4 GB heap. There are about 9 regions (from the specific table I'm working on) per RegionServer.
>> Running HBase 0.94.3
>>
>> The crashes occur when the major compaction fires up, apparently cluster-wide.
>>
>>
>> My question is this: Where do the KV generated during the compaction process queue up before being written to the disk? Is this buffer configurable?
>> When I wrote the Region Observer, my assumption was that the compaction process works in a streaming fashion; thus even if I decide to generate a KV per KV I see, it still shouldn't be a problem memory-wise.
>>
>> Of course, I'm trying to improve my code so it will generate far fewer new KVs (by simply altering the existing KVs received from the InternalScanner).
>>
>> Thank you,
>>
>> Asaf
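
A minimal sketch of the streaming delta aggregation discussed in this thread, written in plain Java with no HBase dependency. Cell, AggregatingIterator and DeltaAggregationDemo are hypothetical names invented for illustration; they are not HBase classes, and this is not the poster's coprocessor. The sketch assumes the input arrives sorted by row key and hour, and it keeps only one read-ahead cell and a running sum in memory. It emits a new aggregate object per group instead of mutating an input cell; whether in-place mutation explains the behavior reported above is not established in the thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a KeyValue in the counters table; not an HBase class.
final class Cell {
    final String row;        // group-by key
    final long hour;         // round-to-hour timestamp
    final boolean aggregate; // false = Delta, true = Aggregate
    final long count;

    Cell(String row, long hour, boolean aggregate, long count) {
        this.row = row;
        this.hour = hour;
        this.aggregate = aggregate;
        this.count = count;
    }
}

// Wraps a sorted source and folds each run of cells sharing (row, hour)
// into a single new Aggregate cell, one group at a time.
final class AggregatingIterator implements Iterator<Cell> {
    private final Iterator<Cell> source;
    private Cell pending; // first cell of the next group, read ahead

    AggregatingIterator(Iterator<Cell> source) {
        this.source = source;
        this.pending = source.hasNext() ? source.next() : null;
    }

    @Override
    public boolean hasNext() {
        return pending != null;
    }

    @Override
    public Cell next() {
        if (pending == null) {
            throw new NoSuchElementException();
        }
        String row = pending.row;
        long hour = pending.hour;
        long sum = pending.count;
        pending = null;
        while (source.hasNext()) {
            Cell c = source.next();
            if (c.row.equals(row) && c.hour == hour) {
                sum += c.count;   // same group: fold into the running sum
            } else {
                pending = c;      // next group starts here
                break;
            }
        }
        // A fresh object is returned; input cells are never modified.
        return new Cell(row, hour, true, sum);
    }
}

class DeltaAggregationDemo {
    // Drains the aggregating iterator into a list (for demonstration only;
    // a streaming consumer would write each cell out as it is produced).
    static List<Cell> compact(List<Cell> sorted) {
        List<Cell> out = new ArrayList<>();
        Iterator<Cell> it = new AggregatingIterator(sorted.iterator());
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) {
        List<Cell> input = Arrays.asList(
            new Cell("k1", 100L, false, 3),
            new Cell("k1", 100L, false, 3),
            new Cell("k1", 200L, false, 5));
        for (Cell c : compact(input)) {
            System.out.println(c.row + " @" + c.hour + " = " + c.count);
        }
    }
}
```

Because next() holds a single pending cell and a long, memory use does not grow with the number of deltas in a group, which matches the streaming assumption described in the thread.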