From: Ryan Rawson <ryanobjc@gmail.com>
To: hbase-user@hadoop.apache.org
Date: Sat, 21 Feb 2009 03:08:51 -0800
Subject: Re: Connection problem during data import into hbase

You have to change hadoop-site.xml and restart HDFS. You should also change
the logging to be more verbose in hbase - check out the hbase FAQ (link
missing -ed).

If you get the problem again, peruse the hbase logs and post what is going
on there. The client errors don't really include the root cause on the
regionserver side.

Good luck,
-ryan
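[For reference, the changes Ryan describes would presumably look like the
following in hadoop-site.xml on each datanode, using the 0.19-era property
names (dfs.datanode.max.xcievers, with Hadoop's own misspelling, and
dfs.datanode.socket.write.timeout) and the values mentioned below in the
thread; a sketch, not something confirmed in the thread itself. -ed]

    <!-- hadoop-site.xml on each datanode; takes effect after an HDFS restart -->
    <property>
      <name>dfs.datanode.max.xcievers</name>
      <value>2047</value>
    </property>
    <property>
      <!-- a value of 0 disables the datanode socket write timeout -->
      <name>dfs.datanode.socket.write.timeout</name>
      <value>0</value>
    </property>

[The more verbose hbase logging is typically switched on in
conf/log4j.properties, e.g. log4j.logger.org.apache.hadoop.hbase=DEBUG. -ed]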
On Sat, Feb 21, 2009 at 2:21 AM, Amandeep Khurana wrote:

> I have 1 master + 2 slaves. I did set the timeout to zero. I'll set the
> xceivers to 2047 and try again. Can this be done in the job config or
> does the site.xml need to be changed and the cluster restarted?
>
> Amandeep
>
> Amandeep Khurana
> Computer Science Graduate Student
> University of California, Santa Cruz
>
> On Sat, Feb 21, 2009 at 2:16 AM, Ryan Rawson wrote:
>
> > So the usual suspects are:
> >
> > - xcievers (I have mine set to 2047)
> > - timeout (I have mine set to 0)
> >
> > I can import a few hundred million records with these settings.
> >
> > How many nodes do you have again?
> >
> > On Sat, Feb 21, 2009 at 2:14 AM, Amandeep Khurana wrote:
> >
> > > Yes, I noticed it this time. The regionserver gets slow or stops
> > > responding and then this error comes. How do I get this to work? Is
> > > there a way of limiting the resources that the map-reduce job should
> > > take?
> > >
> > > I did make the changes in the site config similar to Larry Compton's
> > > config. It only made the job go from dying at 7% to 12% this time.
> > >
> > > Amandeep
> > >
> > > Amandeep Khurana
> > > Computer Science Graduate Student
> > > University of California, Santa Cruz
> > >
> > > On Sat, Feb 21, 2009 at 1:14 AM, stack wrote:
> > >
> > > > It looks like the regionserver hosting root crashed:
> > > >
> > > > org.apache.hadoop.hbase.client.NoServerForRegionException: Timed
> > > > out trying to locate root region
> > > >
> > > > How many servers are you running?
> > > >
> > > > You made similar config. to that reported by Larry Compton in a
> > > > mail from earlier today? (See FAQ and Troubleshooting page for
> > > > more on his listed configs.)
> > > >
> > > > St.Ack
> > > >
> > > > On Sat, Feb 21, 2009 at 1:01 AM, Amandeep Khurana wrote:
> > > >
> > > > > Yes, the table exists before I start the job.
> > > > >
> > > > > I am not using TableOutputFormat. I picked up the sample code
> > > > > from the docs and am using it.
> > > > >
> > > > > Here's the job conf:
> > > > >
> > > > >     JobConf conf = new JobConf(getConf(), IN_TABLE_IMPORT.class);
> > > > >     FileInputFormat.setInputPaths(conf, new Path("import_data"));
> > > > >     conf.setMapperClass(MapClass.class);
> > > > >     conf.setNumReduceTasks(0);
> > > > >     conf.setOutputFormat(NullOutputFormat.class);
> > > > >     JobClient.runJob(conf);
> > > > >
> > > > > Interestingly, the hbase shell isn't working now either. It's
> > > > > giving errors even when I give the command "list"...
> > > > >
> > > > > Amandeep Khurana
> > > > > Computer Science Graduate Student
> > > > > University of California, Santa Cruz
> > > > >
> > > > > On Sat, Feb 21, 2009 at 12:10 AM, stack wrote:
> > > > >
> > > > > > The table exists before you start the MR job?
> > > > > >
> > > > > > When you say 'midway through the job', are you using
> > > > > > TableOutputFormat to insert into your table?
> > > > > >
> > > > > > Which version of hbase?
> > > > > >
> > > > > > St.Ack
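[The "table is null" failure quoted further down points at
IN_TABLE_IMPORT$MapClass, with the HTable opened in configure(). A minimal
sketch of that pattern follows, assuming the HBase 0.19 client API and a
hypothetical tab-separated input layout; the enclosing class, column name,
and parsing are stand-ins, not the poster's actual code. -ed]

    import java.io.IOException;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.io.BatchUpdate;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reporter;

    public class ImportSketch {  // hypothetical enclosing class

      public static class MapClass extends MapReduceBase
          implements Mapper<LongWritable, Text, NullWritable, NullWritable> {

        private HTable table;  // stays null if configure() fails

        public void configure(JobConf job) {
          try {
            // If the client cannot locate the root region here, the
            // exception is swallowed and every subsequent map() call
            // fails with "table is null", as in the trace quoted below.
            table = new HTable(new HBaseConfiguration(), "in_table");
          } catch (IOException e) {
            table = null;
          }
        }

        public void map(LongWritable key, Text value,
            OutputCollector<NullWritable, NullWritable> output,
            Reporter reporter) throws IOException {
          if (table == null) {
            throw new IOException("table is null");
          }
          // Hypothetical flat-file layout: tab-separated row key and value.
          String[] fields = value.toString().split("\t");
          BatchUpdate update = new BatchUpdate(fields[0]);
          update.put("data:value", Bytes.toBytes(fields[1]));
          table.commit(update);
        }
      }
    }

[With NullOutputFormat and zero reduces, as in the quoted job conf, all
writes go through this HTable rather than through the output format. -ed]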
> > > > > > On Fri, Feb 20, 2009 at 9:55 PM, Amandeep Khurana <amansk@gmail.com> wrote:
> > > > > >
> > > > > > > I don't know if this is related or not, but it seems to be.
> > > > > > > After this map-reduce job, I tried to count the number of
> > > > > > > entries in the table in hbase through the shell. It failed
> > > > > > > with the following error:
> > > > > > >
> > > > > > > hbase(main):002:0> count 'in_table'
> > > > > > > NativeException: java.lang.NullPointerException: null
> > > > > > >   from java.lang.String:-1:in `'
> > > > > > >   from org/apache/hadoop/hbase/util/Bytes.java:92:in `toString'
> > > > > > >   from org/apache/hadoop/hbase/client/RetriesExhaustedException.java:50:in `getMessage'
> > > > > > >   from org/apache/hadoop/hbase/client/RetriesExhaustedException.java:40:in `'
> > > > > > >   from org/apache/hadoop/hbase/client/HConnectionManager.java:841:in `getRegionServerWithRetries'
> > > > > > >   from org/apache/hadoop/hbase/client/MetaScanner.java:56:in `metaScan'
> > > > > > >   from org/apache/hadoop/hbase/client/MetaScanner.java:30:in `metaScan'
> > > > > > >   from org/apache/hadoop/hbase/client/HConnectionManager.java:411:in `getHTableDescriptor'
> > > > > > >   from org/apache/hadoop/hbase/client/HTable.java:219:in `getTableDescriptor'
> > > > > > >   from sun.reflect.NativeMethodAccessorImpl:-2:in `invoke0'
> > > > > > >   from sun.reflect.NativeMethodAccessorImpl:-1:in `invoke'
> > > > > > >   from sun.reflect.DelegatingMethodAccessorImpl:-1:in `invoke'
> > > > > > >   from java.lang.reflect.Method:-1:in `invoke'
> > > > > > >   from org/jruby/javasupport/JavaMethod.java:250:in `invokeWithExceptionHandling'
> > > > > > >   from org/jruby/javasupport/JavaMethod.java:219:in `invoke'
> > > > > > >   from org/jruby/javasupport/JavaClass.java:416:in `execute'
> > > > > > >   ... 145 levels...
> > > > > > >   from org/jruby/internal/runtime/methods/DynamicMethod.java:74:in `call'
> > > > > > >   from org/jruby/internal/runtime/methods/CompiledMethod.java:48:in `call'
> > > > > > >   from org/jruby/runtime/CallSite.java:123:in `cacheAndCall'
> > > > > > >   from org/jruby/runtime/CallSite.java:298:in `call'
> > > > > > >   from ruby/hadoop/install/hbase_minus_0_dot_19_dot_0/bin//hadoop/install/hbase/bin/../bin/hirb.rb:429:in `__file__'
> > > > > > >   from ruby/hadoop/install/hbase_minus_0_dot_19_dot_0/bin//hadoop/install/hbase/bin/../bin/hirb.rb:-1:in `__file__'
> > > > > > >   from ruby/hadoop/install/hbase_minus_0_dot_19_dot_0/bin//hadoop/install/hbase/bin/../bin/hirb.rb:-1:in `load'
> > > > > > >   from org/jruby/Ruby.java:512:in `runScript'
> > > > > > >   from org/jruby/Ruby.java:432:in `runNormally'
> > > > > > >   from org/jruby/Ruby.java:312:in `runFromMain'
> > > > > > >   from org/jruby/Main.java:144:in `run'
> > > > > > >   from org/jruby/Main.java:89:in `run'
> > > > > > >   from org/jruby/Main.java:80:in `main'
> > > > > > >   from /hadoop/install/hbase/bin/../bin/HBase.rb:444:in `count'
> > > > > > >   from /hadoop/install/hbase/bin/../bin/hirb.rb:348:in `count'
> > > > > > >   from (hbase):3:in `binding'
> > > > > > >
> > > > > > > Amandeep Khurana
> > > > > > > Computer Science Graduate Student
> > > > > > > University of California, Santa Cruz
> > > > > > >
> > > > > > > On Fri, Feb 20, 2009 at 9:46 PM, Amandeep Khurana <amansk@gmail.com> wrote:
> > > > > > >
> > > > > > > > Here's what it throws on the console:
> > > > > > > >
> > > > > > > > 09/02/20 21:45:29 INFO mapred.JobClient: Task Id :
> > > > > > > > attempt_200902201300_0019_m_000006_0, Status : FAILED
> > > > > > > > java.io.IOException: table is null
> > > > > > > >   at IN_TABLE_IMPORT$MapClass.map(IN_TABLE_IMPORT.java:33)
> > > > > > > >   at IN_TABLE_IMPORT$MapClass.map(IN_TABLE_IMPORT.java:1)
> > > > > > > >   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
> > > > > > > >   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
> > > > > > > >   at org.apache.hadoop.mapred.Child.main(Child.java:155)
> > > > > > > >
> > > > > > > > attempt_200902201300_0019_m_000006_0: org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region
> > > > > > > > attempt_200902201300_0019_m_000006_0: at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRootRegion(HConnectionManager.java:768)
> > > > > > > > attempt_200902201300_0019_m_000006_0: at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:448)
> > > > > > > > attempt_200902201300_0019_m_000006_0: at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:430)
> > > > > > > > attempt_200902201300_0019_m_000006_0: at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:557)
> > > > > > > > attempt_200902201300_0019_m_000006_0: at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:457)
> > > > > > > > attempt_200902201300_0019_m_000006_0: at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:430)
> > > > > > > > attempt_200902201300_0019_m_000006_0: at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:557)
> > > > > > > > attempt_200902201300_0019_m_000006_0: at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:461)
> > > > > > > > attempt_200902201300_0019_m_000006_0: at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:423)
> > > > > > > > attempt_200902201300_0019_m_000006_0: at org.apache.hadoop.hbase.client.HTable.(HTable.java:114)
> > > > > > > > attempt_200902201300_0019_m_000006_0: at org.apache.hadoop.hbase.client.HTable.(HTable.java:97)
> > > > > > > > attempt_200902201300_0019_m_000006_0: at IN_TABLE_IMPORT$MapClass.configure(IN_TABLE_IMPORT.java:120)
> > > > > > > > attempt_200902201300_0019_m_000006_0: at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:58)
> > > > > > > > attempt_200902201300_0019_m_000006_0: at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:83)
> > > > > > > > attempt_200902201300_0019_m_000006_0: at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
> > > > > > > > attempt_200902201300_0019_m_000006_0: at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:58)
> > > > > > > > attempt_200902201300_0019_m_000006_0: at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:83)
> > > > > > > > attempt_200902201300_0019_m_000006_0: at org.apache.hadoop.mapred.MapTask.run(MapTask.java:328)
> > > > > > > > attempt_200902201300_0019_m_000006_0: at org.apache.hadoop.mapred.Child.main(Child.java:155)
> > > > > > > >
> > > > > > > > Amandeep Khurana
> > > > > > > > Computer Science Graduate Student
> > > > > > > > University of California, Santa Cruz
> > > > > > > >
> > > > > > > > On Fri, Feb 20, 2009 at 9:43 PM, Amandeep Khurana <amansk@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > I am trying to import data from a flat file into Hbase
> > > > > > > > > using a MapReduce job. There are close to 2 million rows.
> > > > > > > > > Midway into the job, it starts giving me connection
> > > > > > > > > problems and eventually kills the job.
> > > > > > > > > When the error comes, the hbase shell also stops working.
> > > > > > > > >
> > > > > > > > > This is what I get:
> > > > > > > > >
> > > > > > > > > 2009-02-20 21:37:14,407 INFO org.apache.hadoop.ipc.HBaseClass: Retrying connect to server: /171.69.102.52:60020. Already tried 0 time(s).
> > > > > > > > >
> > > > > > > > > What could be going wrong?
> > > > > > > > >
> > > > > > > > > Amandeep
> > > > > > > > >
> > > > > > > > > Amandeep Khurana
> > > > > > > > > Computer Science Graduate Student
> > > > > > > > > University of California, Santa Cruz