Date: Thu, 9 Jul 2009 21:55:29 -0700
Message-ID: <78568af10907092155t24f7ecc4ucbcaba66475db3c5@mail.gmail.com>
References: <78568af10907092129g4657a1e3g5b9125d076a9fe77@mail.gmail.com>
Subject: Re: Question about HBase
From: Ryan Rawson
To: hbase-user@hadoop.apache.org

That size is not memory-resident, so the total data size is not an issue.
The index size is what limits you with RAM, and it's about 1 MB per region
(256MB region).

-ryan

On Thu, Jul 9, 2009 at 9:51 PM, zsongbo wrote:
> Hi Ryan,
>
> Thanks.
>
> If your region size is about 250MB, then 400 regions can store 100GB of data
> on each regionserver.
> Now, if you have 100TB of data, then you need 1000 regionservers.
> We are not Google or Yahoo, who have so many nodes.
>
> Schubert
>
> On Fri, Jul 10, 2009 at 12:29 PM, Ryan Rawson wrote:
>
>> re: #2: in fact we don't know that... I know that I can run 200-400
>> regions on a regionserver with a heap size of 4-5gb.  More even.  I
>> bet I could have 1000 regions open on 4gb ram.  Each region is ~1mb
>> of data that is resident all the time, so there we go.
>>
>> As for compactions, they are fairly fast, 0-30s or so depending on a
>> number of factors.  Practically speaking it has not been a problem for
>> me, and I've put 1200 gb into hbase so far.
>>
>> On Thu, Jul 9, 2009 at 8:58 PM, zsongbo wrote:
>> > Hi all,
>> >
>> > 1. In this configuration property:
>> >
>> >   <property>
>> >     <name>hbase.hstore.compactionThreshold</name>
>> >     <value>3</value>
>> >     <description>
>> >     If more than this number of HStoreFiles in any one HStore
>> >     (one HStoreFile is written per flush of memcache) then a compaction
>> >     is run to rewrite all HStoreFiles files as one.  Larger numbers
>> >     put off compaction but when it runs, it takes longer to complete.
>> >     During a compaction, updates cannot be flushed to disk.  Long
>> >     compactions require memory sufficient to carry the logging of
>> >     all updates across the duration of the compaction.
>> >     If too large, clients timeout during compaction.
>> >     </description>
>> >   </property>
>> >
>> > That says "During a compaction, updates cannot be flushed to disk."
>> > Does it mean that during a compaction the memcache cannot be flushed
>> > to disk? I think that is not good.
>> >
>> > 2. We know that HBase cannot serve too many regions on each
>> > regionserver. With only 200 regions (256MB each), only 50GB of storage
>> > can be used.
>> > In my test with a 1.5GB heap and 256MB region size, each regionserver
>> > could support 150 regions before hitting OutOfMem.
>> > Can anybody explain the reason in more detail?
>> >
>> > To use more storage, can I set a larger region size, such as 1GB or 10GB?
>> > I worry that compaction time would be long with such large regions.
>> >
>> > Schubert
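
For readers following the sizing argument in this thread, here is a rough sketch of the arithmetic in Python. It is only a back-of-the-envelope aid, not anything HBase provides: the function names are made up for illustration, decimal units are assumed (the thread rounds 400 x 250MB to 100GB), and the 1 MB per region index figure is simply Ryan's estimate from the reply above.

```python
import math

# Assumption: decimal units, matching the thread's rounding (400 x 250 MB = 100 GB).
MB_PER_GB = 1000


def storage_per_server_gb(regions: int, region_size_mb: int) -> float:
    """Data one regionserver can host, in GB, for a given region count and size."""
    return regions * region_size_mb / MB_PER_GB


def servers_needed(total_data_gb: float, regions: int, region_size_mb: int) -> int:
    """Regionservers required to hold total_data_gb, rounded up."""
    return math.ceil(total_data_gb / storage_per_server_gb(regions, region_size_mb))


def index_heap_mb(regions: int, index_mb_per_region: float = 1.0) -> float:
    """Heap taken by always-resident index data, using the ~1 MB/region estimate."""
    return regions * index_mb_per_region


if __name__ == "__main__":
    # Schubert's numbers: 400 regions of 250 MB each, 100 TB of data in total.
    print(storage_per_server_gb(400, 250))    # 100.0
    print(servers_needed(100_000, 400, 250))  # 1000
    # Ryan's estimate: 1000 open regions at ~1 MB of index each.
    print(index_heap_mb(1000))                # 1000.0
```

Under the same assumptions, calling servers_needed(100_000, 400, 1000) shows that 1GB regions would bring the 100TB case down to 250 regionservers, which is the trade-off behind the question about larger region sizes.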