Subject: Re: Frequent changing rowkey - HBase insert
From: Ryan Rawson
To: hbase-user@hadoop.apache.org
Date: Sat, 6 Jun 2009 18:19:25 -0700

In 0.20 things should get faster. Generally speaking I find HBase's insert
performance really good. One of the best even. Plus Just Add Servers (tm).

-ryan

On Sat, Jun 6, 2009 at 6:13 PM, llpind wrote:

>
> Thanks Ryan,
>
> Yeah that sped it up a bit.
>
> I set:
> table.setAutoFlush(false);
> table.setWriteBufferSize(1024*1024*12);
>
> And it's inserting 1M in about 1 minute+. Not the best still.
>
> 2009-06-06 18:06:54.894 ======PROCESSING RECORD: ====== @1000000
> 2009-06-06 18:08:07.725 ======PROCESSING RECORD: ====== @2000000
> 2009-06-06 18:09:24.992 ======PROCESSING RECORD: ====== @3000000
> 2009-06-06 18:11:13.279 ======PROCESSING RECORD: ====== @4000000
>
>
> Ryan Rawson wrote:
> >
> > Don't use the thrift gateway for bulk import.
> >
> > Use the Java API, and be sure to turn off auto flushing and use a
> > reasonably sizable commit buffer. 1-12MB is probably ideal.
> >
> > i can push a 20 node cluster past 180k inserts/sec using this.
> >
> > On Sat, Jun 6, 2009 at 5:51 PM, llpind wrote:
> >
> >>
> >> Thanks Ryan, well done.
> >>
> >> I have no experience using Thrift gateway, could you please provide some
> >> actual code here or in your blog post? I'd love to see how your method
> >> compares with mine.
> >>
> >> Last night I was able to do ~58 million records in ~1.6 hours using the
> >> HBase Java API directly. But with this new data, I'm seeing much slower
> >> times. After reading around, it appears it's because my row key now
> >> changes often, whereas before it was constant for some time (more
> >> columns). Thanks again. :)
> >>
> >>
> >> Ryan Rawson wrote:
> >> >
> >> > Have a look at:
> >> >
> >> > http://ryantwopointoh.blogspot.com/2009/01/performance-of-hbase-importing.html
> >> >
> >> > -ryan
> >> >
> >> >
> >> > On Sat, Jun 6, 2009 at 4:55 PM, llpind wrote:
> >> >
> >> >>
> >> >> I'm doing an insert operation using the java API.
> >> >>
> >> >> When inserting data where the rowkey changes often, it seems the
> >> >> inserts go really slow.
> >> >>
> >> >> Is there another method for doing inserts of this type? (instead of
> >> >> BatchUpdate).
> >> >>
> >> >> Thanks
> >> >> --
> >> >> View this message in context:
> >> >> http://www.nabble.com/Frequent-changing-rowkey---HBase-insert-tp23906724p23906724.html
> >> >> Sent from the HBase User mailing list archive at Nabble.com.
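
For anyone reading this thread later: a rough way to see why
table.setAutoFlush(false) plus a larger table.setWriteBufferSize(...) helps
when every insert touches a new row key is to count client-to-server round
trips. The class below is only a toy model in plain Java, not HBase's HTable;
the name WriteBufferSketch, the helper roundTripsFor, and the 100-byte row
size are all made up for illustration. It assumes, as I understand the client
behaviour, that buffered writes are sent as one batch once the buffer reaches
the configured size.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a client-side write buffer. Hypothetical; this is NOT HBase's HTable.
class WriteBufferSketch {
    private final boolean autoFlush;
    private final long bufferLimitBytes;
    private final List<byte[]> pending = new ArrayList<>();
    private long pendingBytes = 0;
    private long roundTrips = 0;

    WriteBufferSketch(boolean autoFlush, long bufferLimitBytes) {
        this.autoFlush = autoFlush;
        this.bufferLimitBytes = bufferLimitBytes;
    }

    // Queue one row; send immediately if auto-flushing or if the buffer is full.
    void put(byte[] row) {
        pending.add(row);
        pendingBytes += row.length;
        if (autoFlush || pendingBytes >= bufferLimitBytes) {
            flush();
        }
    }

    // Send everything queued so far as a single batch (one simulated round trip).
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        roundTrips++;
        pending.clear();
        pendingBytes = 0;
    }

    long roundTrips() {
        return roundTrips;
    }

    // Insert `rows` rows of `rowSize` bytes each and report the round trips used.
    static long roundTripsFor(boolean autoFlush, long limitBytes, int rows, int rowSize) {
        WriteBufferSketch writer = new WriteBufferSketch(autoFlush, limitBytes);
        byte[] row = new byte[rowSize]; // assumed row size, purely illustrative
        for (int i = 0; i < rows; i++) {
            writer.put(row);
        }
        writer.flush(); // push out whatever is still buffered at the end
        return writer.roundTrips();
    }

    public static void main(String[] args) {
        int rows = 1_000_000;
        int rowSize = 100;
        long twelveMb = 1024L * 1024 * 12; // same buffer size as in the thread
        System.out.println("autoFlush=true : " + roundTripsFor(true, 0, rows, rowSize));
        System.out.println("autoFlush=false: " + roundTripsFor(false, twelveMb, rows, rowSize));
    }
}
```

In this model a million 100-byte rows cost one round trip per row with auto
flush on, and only a handful with a 12MB buffer. Real numbers depend on row
size, region layout, and the server side, so treat it as intuition only and
not as a benchmark.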