Subject: Re: Increasing write throughput..
From: Ted Yu
To: "user@hbase.apache.org"
Date: Fri, 31 Oct 2014 10:51:44 -0700

Gautam:

bq. I've attached a snapshot of the memstore size vs. flushQueueLen

That didn't go through. Consider using a third-party site.

Answers to Stack's questions would help us get more clues.

Cheers

On Fri, Oct 31, 2014 at 10:47 AM, Stack wrote:

> What version of HBase are you on? (Later versions have improvements in write
> throughput, especially with many writing threads.) Post a pastebin of the
> regionserver log in steady state if you don't mind. About how many writers
> are going into the server at a time? How many regions are on the server? Are
> they all being written to at the same rate, or do you have hot regions?
> Thanks,
> St.Ack
>
> On Fri, Oct 31, 2014 at 10:22 AM, Gautam wrote:
>
> > I'm trying to increase the write throughput of our HBase cluster. We're
> > currently doing around 7500 messages per sec per node. I think we have
> > room for improvement, especially since the heap is underutilized and the
> > memstore size doesn't seem to fluctuate much between regular and peak
> > ingestion loads.
> >
> > We mainly have one large table that we write most of the data to. Other
> > tables are mainly OpenTSDB and some relatively small summary tables. This
> > table is read in batch once a day but otherwise is mostly serving writes
> > 99% of the time. This large table has 1 CF and gets flushed at around
> > ~128M fairly regularly, like below:
> >
> > {log}
> > 2014-10-31 16:56:09,499 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> > Finished memstore flush of ~128.2 M/134459888, currentsize=879.5 K/900640
> > for region
> > msg,00102014100515impression\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x002014100515040200049358\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x004138647301\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0002e5a329d2171149bcc1e83ed129312b\x00\x00\x00\x00,1413909604591.828e03c0475b699278256d4b5b9638a2.
> > in 640ms, sequenceid=16861176169, compaction requested=true
> > {log}
> >
> > Here's a pastebin of my hbase-site.xml: http://pastebin.com/fEctQ3im
> >
> > What I've tried:
> > - Turned off major compactions, and am handling these manually.
> > - Bumped up heap Xmx from 24G to 48G.
> > - hbase.hregion.memstore.flush.size = 512M.
> > - lowerLimit/upperLimit on the memstore are at the defaults (0.38, 0.4),
> > since the global heap has enough space to accommodate the default
> > percentages.
> > - Currently running HBase 0.98.1 on an 8-node cluster that's scaled up to
> > 128GB RAM.
> >
> > There hasn't been any appreciable increase in write perf. It's still
> > hovering around the 7500-per-node write throughput number. The flushes
> > still seem to be happening at 128M (instead of the expected 512M).
> >
> > I've attached a snapshot of the memstore size vs. flushQueueLen. The block
> > caches are utilizing the extra heap space but not the memstore. The flush
> > queue lengths have increased, which leads me to believe that it's flushing
> > way too often without any increase in throughput.
> >
> > Please let me know where I should dig further. That's a long email, thanks
> > for reading through :-)
> >
> > Cheers,
> > -Gautam.
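
For reference, a minimal hbase-site.xml sketch of the settings discussed above. The values simply mirror the figures quoted in the thread (512M flush size, the 0.38/0.4 global memstore defaults, major compactions disabled); they are not taken from Gautam's actual pastebin, and the property names are the 0.98-era ones:

  <!-- Values mirror the thread, not Gautam's pastebin. -->
  <property>
    <name>hbase.hregion.memstore.flush.size</name>
    <value>536870912</value> <!-- flush a region's memstore at 512M (value in bytes) -->
  </property>
  <property>
    <name>hbase.regionserver.global.memstore.upperLimit</name>
    <value>0.4</value> <!-- default: force flushes once all memstores reach 40% of heap -->
  </property>
  <property>
    <name>hbase.regionserver.global.memstore.lowerLimit</name>
    <value>0.38</value> <!-- default: forced flushing continues down to 38% of heap -->
  </property>
  <property>
    <name>hbase.hregion.majorcompaction</name>
    <value>0</value> <!-- disable time-based major compactions -->
  </property>

With periodic major compactions disabled as above, they can still be run by hand, e.g. major_compact 'msg' from the HBase shell (table name taken from the log line above).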