From: Ryan Rawson
To: hbase-user@hadoop.apache.org
Subject: Re: error adding row to table in 0.20.4
Date: Mon, 17 May 2010 18:32:59 -0700
In-Reply-To: <2D6136772A13B84E95DF6DA79E85A9F0011D14E85BF2@NSPEXMBX-A.the-lab.llnl.gov>

Hey,

The way to fix this is to combine regions with the Merge tool. If
your table is small you could combine all regions (one pair at a time).
If your table is too large, you can merge regions that are 'wacky'
with adjacent members that are ok.

For example:

Region1 A->B
Region2 B->D
Region3 B->C
Region4 C->D
Region5 D->E

In this case, regions 2-4 are "weird". If they were merged you'd end
up with 3 regions:

Region1 A->B
RegionNew B->D
Region5 D->E

And all would be ok. In this case you need to do 2 merges
(where A is the intermediate merged region, not the row key A):

merge 2-3 -> A
merge A-4 -> New

This example can be extended to any number of weird regions. Don't
worry if the resulting regions are too big; HBase will split them
when it opens them.

The merge tool is available like so:

  bin/hbase org.apache.hadoop.hbase.util.Merge

It takes the table name and the region names. Be sure to copy those
before you take your cluster offline or you might find it hard to find
the region names!

Good luck!
-ryan

On Mon, May 17, 2010 at 6:09 PM, Buttler, David wrote:
> Hi all,
> I recently upgraded to 0.20.4.
> I am now trying to add additional data to my system, and I am getting the following error on my client:
>
> org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server Some server, retryOnlyOne=true, index=0, islastrow=true, tries=9, numtries=10, i=1, listsize=2, region=doc,7d6442c7951b178a6adc9c149ff13d6ea87feccd,1274142309679 for region doc,7d6442c7951b178a6adc9c149ff13d6ea87feccd,1274142309679, row '7e0b8ec68d795612df55144b67e207bdf805d36f', but failed after 10 attempts.
> Exceptions:
>
>         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers$Batch.process(HConnectionManager.java:1167)
>         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfRows(HConnectionManager.java:1248)
>         at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:666)
>         at trinidad.hbase.mapreduce.ingest.ImportWoS$WoSParserMapper.cleanup(ImportWoS.java:192)
>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>         at org.apache.hadoop.mapred.Child.main(Child.java:170)
>
> When I look at the region server log, I see errors like:
>
> 2010-05-17 17:47:11,685 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: Batch puts interrupted at index=0 because:Requested row out of range for HRegion doc,7d6442c7951b178a6adc9c149ff13d6ea87feccd,1274142309679, startKey='7d6442c7951b178a6adc9c149ff13d6ea87feccd', getEndKey()='7ddf19f548f2a75c53a638a4bdc88084f806be4e', row='7e3d88c2ed5e2b02fe374333fb5d7502c6c5ff45'
>
> To me, it looks like the table has gaps between the end of one region and
> the beginning of the next region. E.g., from the list of regions from the doc table:
> doc,005bccc8dcd6ae360b359f42438fd1a651c02048,1274141748324  node-03:60030  51561009    005bccc8dcd6ae360b359f42438fd1a651c02048  00d79413bba4fbd869b0b58c3b23ad2b6fc960b4
> doc,00d79413bba4fbd869b0b58c3b23ad2b6fc960b4,1274141747257  node-02:60030  494463444   00d79413bba4fbd869b0b58c3b23ad2b6fc960b4  013485105e0d328d465b2607057f92cb5f920011
> ...
> doc,7d6442c7951b178a6adc9c149ff13d6ea87feccd,1274142309679  node-03:60030  1541672177  7d6442c7951b178a6adc9c149ff13d6ea87feccd  7ddf19f548f2a75c53a638a4bdc88084f806be4e
> doc,7e7b8dbcec790d28f4154e012226f6d6902a5ac9,1274142333168  node-03:60030  1688440578  7e7b8dbcec790d28f4154e012226f6d6902a5ac9  7ee05fd423269986ceb0dd88b1e4f73de42c5c5e
> ...
>
> It looks like the first couple of regions are fine, but later regions have gaps.
>
> I tried restarting hbase, doing a major compaction, and splitting the regions, none of which fixed the problem.
> I was thinking of trying to copy the table and seeing if that helped, but I can't seem to run the copy_table.rb script either:
> [hadoop@nz bin]$ /opt/hbase/bin/hbase org.jruby.Main copy_table.rb
> file:/opt/hbase-0.20.4/lib/jruby-complete-1.2.0.jar!/META-INF/jruby.home/lib/ruby/site_ruby/1.8/builtin/javasupport/core_ext/object.rb:33:in `get_proxy_or_package_under_package': cannot load Java class org.apache.hadoop.hbase.regionserver.HLogEdit (NameError)
>        from file:/opt/hbase-0.20.4/lib/jruby-complete-1.2.0.jar!/META-INF/jruby.home/lib/ruby/site_ruby/1.8/builtin/javasupport/java.rb:51:in `method_missing'
>        from copy_table.rb:40
>
>
> Any suggestions?
>
> Thanks,
> Dave
>
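
A rough way to find which regions need merging before taking the cluster offline is to copy the region list (name, start key, end key) and check it with a small script. The Python sketch below is only an illustration under stated assumptions: check_regions and plan_merges are hypothetical helpers, not part of HBase; keys are compared as plain strings (fine for hex keys like the ones above); and the empty start/end keys of the first and last region of a real table are not handled. It reports gaps between consecutive regions, groups regions whose key ranges overlap, and lists the pairwise merges for a group in the same style as the Region2/Region3/Region4 example.

```python
from typing import List, Tuple

Region = Tuple[str, str, str]  # (region name, start key, end key)


def check_regions(regions: List[Region]):
    """Return (gaps, overlap_groups) for a list of regions.

    gaps is a list of (end_key, next_start_key) pairs that no region covers.
    overlap_groups is a list of region lists whose key ranges overlap.
    Assumption: keys are non-empty strings that sort like HBase row keys.
    """
    # Stable sort by start key, so regions with equal start keys keep input order.
    ordered = sorted(regions, key=lambda r: r[1])
    gaps, groups = [], []
    if not ordered:
        return gaps, groups

    current = [ordered[0]]
    covered_end = ordered[0][2]
    for region in ordered[1:]:
        _, start, end = region
        if start < covered_end:
            # Starts before the covered range ends: overlaps the current group.
            current.append(region)
            covered_end = max(covered_end, end)
            continue
        if start > covered_end:
            # No region covers the keys between covered_end and start.
            gaps.append((covered_end, start))
        if len(current) > 1:
            groups.append(current)
        current = [region]
        covered_end = end
    if len(current) > 1:
        groups.append(current)
    return gaps, groups


def plan_merges(group: List[Region]):
    """List pairwise merges (left, right, result) that collapse one group.

    The result names ("merged-1", ...) are placeholders for illustration only.
    """
    steps = []
    left = group[0][0]
    for i, (name, _, _) in enumerate(group[1:], start=1):
        result = f"merged-{i}"
        steps.append((left, name, result))
        left = result
    return steps
```

Feeding it the five regions from the example above should flag Region2, Region3 and Region4 as one overlapping group and plan two merges for them. The actual merging would still be done with the Merge tool, one pair at a time.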