Delivered-To: mailing list user@hbase.apache.org
Mailing-List: contact user-help@hbase.apache.org; run by ezmlm
Reply-To: user@hbase.apache.org
Date: Wed, 9 Jun 2010 13:37:07 -0700
Subject: Re: ideas to improve throughput of the hbase writing
From: Ryan Rawson
To: user@hbase.apache.org

you also want this config:

  hbase.hregion.memstore.block.multiplier
  8

that should hopefully clear things up.

-ryan

On Wed, Jun 9, 2010 at 1:34 PM, Jinsong Hu wrote:
> I checked the log, there are lots of
>
> 2010-06-09 17:26:36,736 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Blocking updates for 'IPC Server handler 8 on 60020' on region
> Spam_MsgEventTable,2010-06-09 05:25:32\x09c873847edf6e5390477494956ec04729,1276104002262:
> memstore size 128.1m is >= than blocking 128.0m size
>
> then after that there are lots of
>
> 2010-06-09 17:26:36,800 DEBUG org.apache.hadoop.hbase.regionserver.Store:
> Added hdfs://namenodes1.cloud.ppops.net:8020/hbase/Spam_MsgEventTable/376337880/message_compound_terms/7606939244559826252,
> entries=30869, sequenceid=8350447892, memsize=7.2m, filesize=3.4m to
> Spam_MsgEventTable,2010-06-09 05:25:32\x09c873847edf6
>
> then lots of
>
> 2010-06-09 17:26:39,005 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Unblocking updates for region Spam_MsgEventTable,2010-06-09
> 05:25:32\x09c873847edf6e5390477494956ec04729,1276104002262 'IPC Server handler 8 on 60020'
>
> This cycle
> happens again and again in the log. What can I do in this case to speed up
> writing? Right now the writing speed is really slow, close to 4 rows/second
> for a regionserver.
>
> I checked the code to try to find out why there are so many store files,
> and I noticed that each second, when the regionserver reports to the master,
> it calls the memstore flush and writes a store file.
>
> The parameter hbase.regionserver.msginterval has a default value of 1
> second. I am thinking of changing it to 10 seconds. Can that help? I am also
> thinking of changing hbase.hstore.blockingStoreFiles to 1000. I noticed that
> there is a parameter hbase.hstore.blockingWaitTime with a default value of
> 1.5 minutes; as soon as the 1.5 minutes is reached, the compaction is
> executed. I am fine with running compaction every 1.5 minutes, but running
> compaction every second and driving the CPU consistently higher than 100% is
> not wanted.
>
> Any suggestion on what kind of parameters to change to improve my writing
> speed?
>
> Jimmy
>
> --------------------------------------------------
> From: "Ryan Rawson"
> Sent: Wednesday, June 09, 2010 1:01 PM
> To:
> Subject: Re: ideas to improve throughput of the hbase writing
>
>> The log will say something like "blocking updates to..." when you hit
>> a limit. That log you indicate is just the regionserver attempting to
>> compact a region, but shouldn't prevent updates.
>>
>> What else does your logfile say? Search for the string (case
>> insensitive) "blocking updates"...
>>
>> -ryan
>>
>> On Wed, Jun 9, 2010 at 11:52 AM, Jinsong Hu wrote:
>>>
>>> I made this change:
>>>
>>>  hbase.hstore.blockingStoreFiles
>>>  15
>>>
>>> the system is still slow.
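
[Editor's note: the parameters being traded back and forth above normally live in hbase-site.xml. A hypothetical fragment collecting them, using only the values mentioned in this thread (treat them as examples from the discussion, not tuned recommendations):]

```xml
<!-- Sketch of an hbase-site.xml fragment with the write-path settings
     discussed in this thread; values are the ones mentioned above. -->
<property>
  <!-- how many flushed store files a region may accumulate before
       updates are blocked -->
  <name>hbase.hstore.blockingStoreFiles</name>
  <value>15</value>
</property>
<property>
  <!-- allow the memstore to grow to N x flush size before blocking -->
  <name>hbase.hregion.memstore.block.multiplier</name>
  <value>8</value>
</property>
<property>
  <!-- maximum time (ms) updates stay blocked on too many store files;
       90000 ms is the 1.5 minutes mentioned above -->
  <name>hbase.hstore.blockingWaitTime</name>
  <value>90000</value>
</property>
```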
>>>
>>> Here is the most recent value for the region:
>>> stores=21, storefiles=186, storefileSizeMB=9681, memstoreSizeMB=128,
>>> storefileIndexSizeMB=12
>>>
>>> And the same log still happens:
>>>
>>> 2010-06-09 18:36:40,577 WARN org.apache.hadoop.hbase.regionserver.MemStoreFlusher:
>>> Region SOME_ABCEventTable,2010-06-09
>>> 09:56:56\x093dc01b4d2c4872963717d80d8b5c74b1,1276107447570 has too many
>>> store files, putting it back at the end of the flush queue.
>>>
>>> One idea that I have now is to further increase
>>> hbase.hstore.blockingStoreFiles to a very high number, such as 1000.
>>> What is the negative impact of this change?
>>>
>>> Jimmy
>>>
>>> --------------------------------------------------
>>> From: "Ryan Rawson"
>>> Sent: Monday, June 07, 2010 3:58 PM
>>> To:
>>> Subject: Re: ideas to improve throughput of the hbase writing
>>>
>>>> Try setting this config value:
>>>>
>>>>  hbase.hstore.blockingStoreFiles
>>>>  15
>>>>
>>>> and see if that helps.
>>>>
>>>> The thing about the 1 compact thread is that the scarce resource being
>>>> preserved in this case is cluster IO. People have had issues with
>>>> compaction IO being too heavy.
>>>>
>>>> In your case, this setting can let the regionserver build up more
>>>> store files without pausing your import.
>>>>
>>>> -ryan
>>>>
>>>> On Mon, Jun 7, 2010 at 3:52 PM, Jinsong Hu wrote:
>>>>>
>>>>> Hi, there:
>>>>> While saving lots of data to hbase, I noticed that the regionserver
>>>>> CPU went to more than 100%. Examination shows that the hbase CompactSplit
>>>>> thread is spending full time compacting/splitting hbase store files. The
>>>>> machine I have is an 8 core machine. Because there is only one
>>>>> compact/split thread in hbase, only one core is fully used.
>>>>> I continue to submit map/reduce jobs to insert records into hbase. Most
>>>>> of the time, a job runs very fast, around 1-5 minutes. But occasionally
>>>>> it can take 2 hours. That is very bad for me. I highly suspect that the
>>>>> occasional slow insertion is related to the insufficient speed of the
>>>>> compactsplit thread.
>>>>> I am thinking that I should parallelize the compactsplit work. The code
>>>>> has this: the for loop "for (Store store : stores.values())" can be
>>>>> parallelized via java 5's thread pool, so that multiple cores are used
>>>>> instead of only one. I wonder if this will help to increase the
>>>>> throughput.
>>>>>
>>>>> Somebody mentioned that I can increase the region size so that I don't
>>>>> do so many compactions under a heavy writing situation. Does anybody
>>>>> have experience showing that it helps?
>>>>>
>>>>> Jimmy.
>>>>>
>>>>>  byte [] compactStores(final boolean majorCompaction)
>>>>>  throws IOException {
>>>>>    if (this.closing.get() || this.closed.get()) {
>>>>>      LOG.debug("Skipping compaction on " + this + " because closing/closed");
>>>>>      return null;
>>>>>    }
>>>>>    splitsAndClosesLock.readLock().lock();
>>>>>    try {
>>>>>      byte [] splitRow = null;
>>>>>      if (this.closed.get()) {
>>>>>        return splitRow;
>>>>>      }
>>>>>      try {
>>>>>        synchronized (writestate) {
>>>>>          if (!writestate.compacting && writestate.writesEnabled) {
>>>>>            writestate.compacting = true;
>>>>>          } else {
>>>>>            LOG.info("NOT compacting region " + this +
>>>>>                ": compacting=" + writestate.compacting +
>>>>>                ", writesEnabled=" + writestate.writesEnabled);
>>>>>            return splitRow;
>>>>>          }
>>>>>        }
>>>>>        LOG.info("Starting" + (majorCompaction ? " major " : " ") +
>>>>>            "compaction on region " + this);
>>>>>        long startTime = System.currentTimeMillis();
>>>>>        doRegionCompactionPrep();
>>>>>        long maxSize = -1;
>>>>>        for (Store store : stores.values()) {
>>>>>          final Store.StoreSize ss = store.compact(majorCompaction);
>>>>>          if (ss != null && ss.getSize() > maxSize) {
>>>>>            maxSize = ss.getSize();
>>>>>            splitRow = ss.getSplitRow();
>>>>>          }
>>>>>        }
>>>>>        doRegionCompactionCleanup();
>>>>>        String timeTaken = StringUtils.formatTimeDiff(
>>>>>            System.currentTimeMillis(), startTime);
>>>>>        LOG.info("compaction completed on region " + this + " in " + timeTaken);
>>>>>      } finally {
>>>>>        synchronized (writestate) {
>>>>>          writestate.compacting = false;
>>>>>          writestate.notifyAll();
>>>>>        }
>>>>>      }
>>>>>      return splitRow;
>>>>>    } finally {
>>>>>      splitsAndClosesLock.readLock().unlock();
>>>>>    }
>>>>>  }
>>>>>
>>>>
>>>
>>
>
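
[Editor's note: the parallelization Jimmy proposes for the per-store loop in compactStores() can be sketched with a java.util.concurrent fixed thread pool. This is a hypothetical illustration, not HBase code: Store, StoreSize, and compact() below are minimal stand-ins for the real classes, and the sketch only shows how the maxSize/splitRow reduction could run across several cores.]

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch of a parallel store-compaction loop.
// Store/StoreSize are stand-in stubs, not the real HBase classes.
public class ParallelCompactSketch {

  // Minimal stand-in for HBase's Store.StoreSize.
  static class StoreSize {
    final long size;
    final byte[] splitRow;
    StoreSize(long size, byte[] splitRow) {
      this.size = size;
      this.splitRow = splitRow;
    }
  }

  // Minimal stand-in for HBase's Store; compact() just fabricates a result.
  static class Store {
    final long resultSize;
    Store(long resultSize) { this.resultSize = resultSize; }
    StoreSize compact(boolean major) {
      return new StoreSize(resultSize, ("row-" + resultSize).getBytes());
    }
  }

  // Submit store.compact() for every store to a pool, then reduce to the
  // largest result, mirroring the maxSize/splitRow logic in compactStores().
  static StoreSize compactAllParallel(List<Store> stores, final boolean major,
      int threads) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<StoreSize>> futures = new ArrayList<Future<StoreSize>>();
      for (final Store store : stores) {
        futures.add(pool.submit(new Callable<StoreSize>() {
          public StoreSize call() { return store.compact(major); }
        }));
      }
      StoreSize biggest = null;
      for (Future<StoreSize> f : futures) {
        StoreSize ss = f.get();  // wait for each compaction to finish
        if (ss != null && (biggest == null || ss.size > biggest.size)) {
          biggest = ss;
        }
      }
      return biggest;
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    List<Store> stores = new ArrayList<Store>();
    stores.add(new Store(10));
    stores.add(new Store(30));
    stores.add(new Store(20));
    StoreSize biggest = compactAllParallel(stores, false, 4);
    System.out.println("max store size after compaction: " + biggest.size);
  }
}
```

Note that the real compactStores() also mutates writestate under synchronization, and compaction is deliberately single-threaded to cap cluster IO, which is exactly the concern Ryan raises; a parallel version would trade IO pressure for CPU utilization.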