Subject: Re: Hbase inserts very slow
From: Vishal Kapoor <vishal.kapoor.in@gmail.com>
To: user@hbase.apache.org
Date: Wed, 16 Feb 2011 18:29:46 -0500

J-D,

I should also mention that my data distribution across the three families
is 1:1:1. I have three families so that I can have the same qualifiers in
each of them, and the data in those families is LIVE:MasterA:MasterB.

Vishal

On Wed, Feb 16, 2011 at 6:22 PM, Jean-Daniel Cryans wrote:

> Very often there's no need for more than 1 family, I would suggest you
> explore that possibility first.
>
> J-D
>
> On Wed, Feb 16, 2011 at 3:13 PM, Vishal Kapoor wrote:
> > does that mean I am only left with the choice of writing to the three
> > families in three different map jobs?
> > or can I do it any other way?
> > thanks,
> > Vishal
> >
> > On Wed, Feb 16, 2011 at 12:56 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
> >>
> >> First, loading into 3 families is currently a bad idea and is bound to
> >> be inefficient, here's the reason why:
> >> https://issues.apache.org/jira/browse/HBASE-3149
> >>
> >> Those log lines mean that your scanning of the first table is
> >> generating a lot of block cache churn. When setting up the Map, set
> >> your scanner to setCacheBlocks(false) before passing it to
> >> TableMapReduceUtil.initTableMapperJob.
> >>
> >> Finally, you may want to give more memory to the region server.
> >>
> >> J-D
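
For reference, J-D's scanner advice comes down to a job setup along these
lines. This is a minimal sketch against the HBase 0.90 mapreduce API, not
code from the thread; the job name is made up and ExplodeMapper is a
placeholder (sketched after the next quoted message):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
    import org.apache.hadoop.mapreduce.Job;

    public class ExplodeJob {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = new Job(conf, "explode LIVE_RAW_TABLE into LIVE_TABLE");

        Scan scan = new Scan();
        scan.setCaching(500);        // fetch rows in batches instead of one RPC per row
        scan.setCacheBlocks(false);  // keep this one-off full scan out of the block cache

        TableMapReduceUtil.initTableMapperJob(
            "LIVE_RAW_TABLE",             // source table being scanned
            scan,                         // the scan configured above
            ExplodeMapper.class,          // placeholder mapper, sketched further down
            ImmutableBytesWritable.class, // map output key: the new composite row key
            Put.class,                    // map output value: the assembled Put
            job);

        // null reducer = identity; the Puts pass straight through to LIVE_TABLE
        TableMapReduceUtil.initTableReducerJob("LIVE_TABLE", null, job);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }
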
> >> On Wed, Feb 16, 2011 at 7:35 AM, Vishal Kapoor wrote:
> >> > Lars,
> >> >
> >> > I am still working on pseudo distributed:
> >> > hadoop-0.20.2+737, and hbase-0.90.0 with the hadoop jar from the
> >> > hadoop install.
> >> >
> >> > I have a LIVE_RAW_TABLE table, which gets values from a live system.
> >> > I go through each row of that table and get the row ids of two
> >> > reference tables, TABLE_A and TABLE_B, from it; then I explode this
> >> > into a new table, LIVE_TABLE. I use
> >> > TableMapReduceUtil.initTableReducerJob("LIVE_TABLE", null, job);
> >> >
> >> > LIVE_TABLE has three families, LIVE, A and B, and the row id is a
> >> > composite key: reverseTimeStamp/rowidA/rowIdB.
> >> > After that I run a bunch of map reduce jobs to consolidate the data.
> >> > To start with, I have around 15000 rows in LIVE_RAW_TABLE.
> >> >
> >> > When I start my job, I see it running quite well till I am almost
> >> > done with 5000 rows; then it starts printing messages in the logs
> >> > which I did not use to see before. The job used to run for around
> >> > 900 sec (I have a lot of data parsing while exploding); 15000 rows
> >> > from LIVE_RAW_TABLE explode to around 500,000 rows in LIVE_TABLE.
> >> >
> >> > Since those debug messages appeared, the job runs for around 2500
> >> > sec. I have not changed anything, including the table design.
> >> >
> >> > Here is my table description:
> >> >
> >> > {NAME => 'LIVE_TABLE', FAMILIES => [
> >> >   {NAME => 'LIVE', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0',
> >> >    VERSIONS => '1', COMPRESSION => 'NONE', TTL => '2147483647',
> >> >    BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'},
> >> >   {NAME => 'A', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0',
> >> >    VERSIONS => '1', COMPRESSION => 'NONE', TTL => '2147483647',
> >> >    BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'},
> >> >   {NAME => 'B', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0',
> >> >    VERSIONS => '1', COMPRESSION => 'NONE', TTL => '2147483647',
> >> >    BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}
> >> >
> >> > thanks for all your help.
> >> >
> >> > Vishal
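
The explode step described above could look roughly like the mapper below.
This is a hypothetical sketch, not Vishal's code: the qualifier name, the
values, and the TABLE_A/TABLE_B lookups are stand-ins. What it illustrates
is the answer to the question earlier in the thread: a single Put can carry
cells for all three families, so one map job is enough.

    import java.io.IOException;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableMapper;
    import org.apache.hadoop.hbase.util.Bytes;

    public class ExplodeMapper extends TableMapper<ImmutableBytesWritable, Put> {
      @Override
      protected void map(ImmutableBytesWritable key, Result raw, Context ctx)
          throws IOException, InterruptedException {
        // Stand-ins: the real job parses the LIVE_RAW_TABLE row and looks
        // these ids up in TABLE_A and TABLE_B.
        String rowIdA = "a1";
        String rowIdB = "b1";
        long reverseTs = Long.MAX_VALUE - System.currentTimeMillis(); // newest rows sort first

        byte[] rowKey = Bytes.toBytes(reverseTs + "/" + rowIdA + "/" + rowIdB);
        Put put = new Put(rowKey);
        // One Put carries cells for all three families in a single map job.
        put.add(Bytes.toBytes("LIVE"), Bytes.toBytes("q"), raw.value());
        put.add(Bytes.toBytes("A"), Bytes.toBytes("q"), Bytes.toBytes(rowIdA));
        put.add(Bytes.toBytes("B"), Bytes.toBytes("q"), Bytes.toBytes(rowIdB));
        ctx.write(new ImmutableBytesWritable(rowKey), put);
      }
    }
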
> >> > On Wed, Feb 16, 2011 at 4:26 AM, Lars George wrote:
> >> >
> >> >> Hi Vishal,
> >> >>
> >> >> These are DEBUG level messages and are from the block cache, there is
> >> >> nothing wrong with that. Can you explain more what you do and see?
> >> >>
> >> >> Lars
> >> >>
> >> >> On Wed, Feb 16, 2011 at 4:24 AM, Vishal Kapoor wrote:
> >> >> > all was working fine and suddenly I see a lot of logs like below
> >> >> >
> >> >> > 2011-02-15 22:19:04,023 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache:
> >> >> >   Block cache LRU eviction started; Attempting to free 19.88 MB of total=168.64 MB
> >> >> > 2011-02-15 22:19:04,025 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache:
> >> >> >   Block cache LRU eviction completed; freed=19.91 MB, total=148.73 MB,
> >> >> >   single=74.47 MB, multi=92.37 MB, memory=166.09 KB
> >> >> > 2011-02-15 22:19:11,207 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache:
> >> >> >   Block cache LRU eviction started; Attempting to free 19.88 MB of total=168.64 MB
> >> >> > 2011-02-15 22:19:11,444 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache:
> >> >> >   Block cache LRU eviction completed; freed=19.93 MB, total=149.09 MB,
> >> >> >   single=73.91 MB, multi=93.32 MB, memory=166.09 KB
> >> >> > 2011-02-15 22:19:21,494 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache:
> >> >> >   Block cache LRU eviction started; Attempting to free 19.87 MB of total=168.62 MB
> >> >> > 2011-02-15 22:19:21,760 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache:
> >> >> >   Block cache LRU eviction completed; freed=19.91 MB, total=148.84 MB,
> >> >> >   single=74.22 MB, multi=92.73 MB, memory=166.09 KB
> >> >> > 2011-02-15 22:19:39,838 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache:
> >> >> >   Block cache LRU eviction started; Attempting to free 19.87 MB of total=168.62 MB
> >> >> > 2011-02-15 22:19:39,852 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache:
> >> >> >   Block cache LRU eviction completed; freed=19.91 MB, total=148.71 MB,
> >> >> >   single=75.35 MB, multi=91.48 MB, memory=166.09 KB
> >> >> > 2011-02-15 22:19:49,768 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache:
> >> >> >   Block cache LRU eviction started; Attempting to free 19.87 MB of total=168.62 MB
> >> >> > 2011-02-15 22:19:49,770 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache:
> >> >> >   Block cache LRU eviction completed; freed=19.91 MB, total=148.71 MB,
> >> >> >   single=76.48 MB, multi=90.35 MB, memory=166.09 KB
> >> >> >
> >> >> > I haven't changed anything, including the table definitions.
> >> >> > please let me know where to look...
> >> >> >
> >> >> > thanks,
> >> >> > Vishal Kapoor
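
For what it's worth, the sizes in those eviction lines look consistent with
the 0.90 defaults on a pseudo-distributed install: a stock 1000 MB heap with
the default hfile.block.cache.size of 0.2 gives a roughly 200 MB block
cache, and the LRU cache starts evicting at roughly 85% of capacity, which
is right around the total=168 MB the log shows. J-D's "give more memory to
the region server" suggestion would look something like the following; the
4000 MB figure is only an example to size to the machine, not a
recommendation:

    # conf/hbase-env.sh: region server JVM heap in MB (the default is 1000)
    export HBASE_HEAPSIZE=4000

    <!-- conf/hbase-site.xml: fraction of the heap given to the block cache.
         0.2 is the 0.90 default; shown here only to make the knob explicit. -->
    <property>
      <name>hfile.block.cache.size</name>
      <value>0.2</value>
    </property>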