Subject: Re: OOME Java heap space
From: Bluemetrix Development
To: hbase-user@hadoop.apache.org
Date: Wed, 24 Feb 2010 12:29:33 -0500

Thanks,
So, 3 nodes (2 CPU 8GB RAM) - one shared master/RS and two other RS.
Two tables: table1 about 8M rows and table2 about 10M rows.
A few days ago, I went to insert another 8M or so into table1 using MR
and HBase crashed fairly early on in the process.
Haven't been able to bring it back up since then due to all 3 RS reporting OOME.
I'm going to go ahead and try to re-insert the data into table1 and see if
I can get the full 16M in without issue. Or try to investigate the crash again.
If there is anything you can think of that I should watch out for, please
let me know. I'll report back with some results.

On Wed, Feb 24, 2010 at 11:38 AM, Stack wrote:
> Then, the content of the file is not the issue. Something else is going on.
>
> The location of an OOME is not necessarily a pointer at the culprit.
> It can be, but usually the culprit has done its nasty work and some
> innocent trying to get a bit of memory ends up throwing the OOME.
>
> So, let's start over. How many regionservers? How many regions? You
> are just trying to start up hbase?
>
> St.Ack
>
> On Wed, Feb 24, 2010 at 7:28 AM, Bluemetrix Development wrote:
>> Hi,
>> Scanning the KV pairs using HFile as you suggested, the biggest value
>> I came across was a string about 2400 characters long, with that
>> particular row having 25000 cells from what I can tell. Is this big
>> enough to cause the problem?
>> There were a dozen or so values over 1000 chars long, but mainly small
>> values under 100 chars as I mentioned earlier.
>>
>> If I have to set a hard limit on the length of cell values, that is
>> not a problem for the moment - I can chop these strings down.
>>
>> Thanks
>>
>> On Tue, Feb 23, 2010 at 3:40 PM, Bluemetrix Development wrote:
>>> Well, the cells themselves should not be too big. Just a few Strings
>>> (url length) or ints at the most per cell.
>>> It's just that there could be 10M (or maybe even 100M) cells per row.
>>> I'm on the latest 0.20.3.
>>> I'll try to find the big record as you suggested earlier and see what
>>> it looks like.
>>> Thanks
>>>
>>> On Tue, Feb 23, 2010 at 3:18 PM, Stack wrote:
>>>> On Tue, Feb 23, 2010 at 10:40 AM, Bluemetrix Development wrote:
>>>>>
>>>>> If this is the case though, how big is too big?
>>>>
>>>> Each cell and its coordinates are read into memory. If not enough
>>>> memory, then OOME.
>>>>
>>>>> Or does it depend on my disk/memory resources?
>>>>> I'm currently using dynamic column qualifiers, so I could have been
>>>>> reaching rows with 10s of millions of unique column qualifiers each.
>>>>
>>>> This should be fine as long as you are on a recent hbase.
>>>>
>>>> I'd say it's a big cell or many big cells concurrently that caused the OOME.
>>>>
>>>>> Or, with other tables using timestamps as another dimension to the
>>>>> data, and therefore reaching 10s of millions of versions.
>>>>> (I was trying to get HBase back up so I could count these numbers.)
>>>>>
>>>>> What limits should I use for the time being for number of qualifiers
>>>>> and number of timestamps/versions?
>>>>
>>>> Shouldn't be an issue.
>>>>
>>>> St.Ack
>>>>
>>>
>>
>
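
A rough, untested sketch of the "chop these strings down" idea, written
against the 0.20-era client API. The table name is from the thread, but the
column family "cf", the row key, the qualifier, and the 1000-character cap
are invented placeholders:

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class TruncatingWriter {
      // Invented hard limit on value length; tune to whatever cutoff makes sense.
      private static final int MAX_VALUE_LENGTH = 1000;

      public static void main(String[] args) throws Exception {
        HBaseConfiguration conf = new HBaseConfiguration();  // picks up hbase-site.xml
        HTable table = new HTable(conf, "table1");

        String row = "example-row";            // placeholder row key
        String qualifier = "http://a.url/x";   // dynamic qualifier, as in the thread
        String value = "...some long value produced by the MR job...";

        // Chop the string down before it goes into the cell.
        if (value.length() > MAX_VALUE_LENGTH) {
          value = value.substring(0, MAX_VALUE_LENGTH);
        }

        Put put = new Put(Bytes.toBytes(row));
        put.add(Bytes.toBytes("cf"), Bytes.toBytes(qualifier), Bytes.toBytes(value));
        table.put(put);
        table.flushCommits();  // autoflush is on by default; this is just belt and braces
      }
    }

With a cap like this applied inside the MR job, no single cell can exceed the
limit, though anything past the cutoff is silently dropped.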
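
Along the same lines, a hedged sketch of how the per-row cell/version counts
mentioned above could be pulled once the cluster is back up - again not from
the thread, and the row key is an invented placeholder. Note that a Get over a
row with tens of millions of cells pulls the whole row into client memory,
which is itself an easy way to hit an OOME:

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.KeyValue;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class RowCellCounter {
      public static void main(String[] args) throws Exception {
        HBaseConfiguration conf = new HBaseConfiguration();
        HTable table = new HTable(conf, "table2");

        Get get = new Get(Bytes.toBytes("example-row"));  // placeholder row key
        get.setMaxVersions();                             // every version, not just the latest

        Result result = table.get(get);
        KeyValue[] kvs = result.raw();

        int cells = 0;
        int biggestValue = 0;
        for (KeyValue kv : kvs) {
          cells++;
          biggestValue = Math.max(biggestValue, kv.getValueLength());
        }
        System.out.println("cells in row: " + cells
            + ", biggest value: " + biggestValue + " bytes");
      }
    }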