From: Lars George
Date: Thu, 09 Apr 2009 10:09:09 +0200
To: hbase-user@hadoop.apache.org
CC: "Taylor, Ronald C"
Subject: Re: Still need help with data upload into HBase
Message-ID: <49DDAD25.1030502@worldlingo.com>

Hi Ron,

The syntax is like this (sic):

  <property>
    <name>dfs.datanode.max.xcievers</name>
    <value>4096</value>
  </property>

and it is documented on the HBase wiki here:
http://wiki.apache.org/hadoop/Hbase/Troubleshooting

Regards,
Lars

Taylor, Ronald C wrote:
> Hi Ryan,
>
> Thanks for the suggestion on checking whether the number of file handles allowed actually gets increased after I make the change to /etc/security/limits.conf.
>
> Turns out it was not. I had to check with one of our sysadmins to make sure the new 32K number-of-handles setting actually gets used on my Red Hat box.
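[The limits.conf change discussed here is normally a pair of nofile lines for whichever account runs the Hadoop/HBase daemons. A sketch using the 32K figure from the thread - the user name is only a placeholder, since Ron doesn't say which account he uses:

  hadoop  soft  nofile  32768
  hadoop  hard  nofile  32768

The new limit only shows up in fresh login sessions for that user, so the daemons need to be restarted from a new shell before it takes effect.]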
> With that, and with one other change which I'll get to in a moment, I finally was able to read in all the rows that I wanted, instead of the program breaking before finishing. I checked the table by scanning it - looks OK. So it looks like things are working as they should.
>
> Thank you very much for the help.
>
> Now as to the other parameter that needed changing: I found that the xceivers (xcievers?) limit was not being bumped up - I was crashing on that. I went to add what Ryan suggested in hadoop-site.xml, i.e.,
>
>   <property>
>     <name>dfs.datanode.max.xcievers</name>
>     <value>2047</value>
>   </property>
>
> and discovered that I did not know whether to use "dfs.datanode.max.xcievers" or "dfs.datanode.max.xceivers", where the "i" and "e" switch. I was getting error msgs in the log files with
>
>   "xceiverCount 257 exceeds the limit of concurrent xcievers 256"
>
> with BOTH spelling variants employed within the same error msg. Very confusing. So I added property entries for both spellings in the hadoop-site.xml file, figuring one of them would take effect. That appears to work fine. But I would like to get the correct spelling. I did a Google search and the spelling keeps popping up both ways, so I remain confused.
>
> I think the HBase getting-started documentation could use some enhancement on file handle settings, xceiver (xciever?) settings, and datanode handler count settings.
>
> Ron
>
> ___________________________________________
> Ronald Taylor, Ph.D.
> Computational Biology & Bioinformatics Group
> Pacific Northwest National Laboratory
> 902 Battelle Boulevard
> P.O. Box 999, MSIN K7-90
> Richland, WA 99352 USA
> Office: 509-372-6568
> Email: ronald.taylor@pnl.gov
> www.pnl.gov
>
> -----Original Message-----
> From: Ryan Rawson [mailto:ryanobjc@gmail.com]
> Sent: Monday, April 06, 2009 6:47 PM
> To: Taylor, Ronald C
> Cc: hbase-user@hadoop.apache.org
> Subject: Re: Still need help with data upload into HBase
>
> I ran into a problem on Ubuntu where /etc/security/limits.conf wasn't being honored due to a missing line in /etc/pam.d/common-session:
>
>   "session required pam_limits.so"
>
> This prevented the ulimits from being applied.
>
> Can you sudo to the hadoop/hbase user and verify with ulimit -a?
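[A quick way to carry out that check, assuming the daemons run under an account named hadoop (the name is a guess, not taken from the thread):

  su - hadoop -c 'ulimit -n'

ulimit -n reports just the open-files limit; if pam_limits is not being applied, this typically still prints the stock 1024 rather than the value set in limits.conf.]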
>> Hello Ryan and the list,
>>
>> Well, I am still stuck. In addition to making the changes recommended by Ryan to my hadoop-site.xml file (see below), I also added a line for HBase to /etc/security/limits.conf and had the fs.file-max hugely increased, to hopefully handle any file handle limit problem. Still no luck with my upload program. It fails about where it did before, around the loading of the 160,000th row into the one table that I create in HBase. I didn't get the "too many files open" msg, but did get "handleConnectionFailure" in the same place in the upload.
>>
>> I then tried a complete reinstall of HBase and Hadoop, upgrading from 0.19.0 to 0.19.1. I used the same config parameters as before, and reran the program. It fails again, at about the same number of rows uploaded - and I'm back to getting "too many files open" as what I think is the principal error msg.
>>
>> So - does anybody have any suggestions? I am running a "pseudo-distributed" installation of Hadoop on one Red Hat Linux machine with about 3 GB of RAM. Are there any known problems with bulk uploads when running "pseudo-distributed" on a single box, rather than a true cluster? Is there anything else I can try?
>>
>> Ron
>>
>> ------------------------------
>> From: Ryan Rawson [mailto:ryanobjc@gmail.com]
>> Sent: Friday, April 03, 2009 5:56 PM
>> To: Taylor, Ronald C
>> Subject: Re: FW: Still need help with data upload into HBase
>>
>> Welcome to hbase :-)
>>
>> This is pretty much how it goes for nearly every new user.
>>
>> We might want to review our docs...
>>
>> On Fri, Apr 3, 2009 at 5:54 PM, Taylor, Ronald C wrote:
>>
>>> Thanks. I'll make those settings, too, in addition to bumping up the file handle limit, and give it another go.
>>> Ron
>>>
>>> -----Original Message-----
>>> From: Ryan Rawson [mailto:ryanobjc@gmail.com]
>>> Sent: Friday, April 03, 2009 5:48 PM
>>> To: hbase-user@hadoop.apache.org
>>> Subject: Re: Still need help with data upload into HBase
>>>
>>> Hey,
>>>
>>> File handle - yes... there was a FAQ and/or getting-started page which talks about upping lots of limits.
>>>
>>> I have these set in my hadoop-site.xml (which is read by the datanode):
>>>
>>>   <property>
>>>     <name>dfs.datanode.max.xcievers</name>
>>>     <value>2047</value>
>>>   </property>
>>>
>>>   <property>
>>>     <name>dfs.datanode.handler.count</name>
>>>     <value>10</value>
>>>   </property>
>>>
>>> I should probably set the datanode.handler.count higher.
>>>
>>> Don't forget to toss a reasonable amount of RAM at HDFS... not sure what that is exactly, but -Xmx1000m wouldn't hurt.
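[The heap knobs live in the env scripts rather than in hadoop-site.xml. A sketch of the usual place to raise them, using the 1000 MB figure mentioned in the thread (both variables default to 1000 in the 0.19-era scripts):

  # conf/hadoop-env.sh
  export HADOOP_HEAPSIZE=1000

  # conf/hbase-env.sh
  export HBASE_HEAPSIZE=1000
]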
>>> On Fri, Apr 3, 2009 at 5:44 PM, Taylor, Ronald C wrote:
>>>
>>>> Hi Ryan,
>>>>
>>>> Thanks for the info. Re checking the Hadoop datanode log file: I just did so, and found a "too many open files" error. Checking the HBase FAQ, I see that I should drastically bump up the file handle limit. So I will give that a try.
>>>>
>>>> Question: what does the xciver variable do? My hadoop-site.xml file does not contain any entry for such a var. (Nothing reported in the datanode log file either with the word "xciver".)
>>>>
>>>> Re using the local file system: well, as soon as I have a nice data set loaded in, I'm starting a demo project manipulating it for our Env Molecular Sciences Lab (EMSL), a DOE national user facility. And I'm supposed to be doing the manipulating using MapReduce programs, to show the usefulness of such an approach. So I need Hadoop and the HDFS. And so I would prefer to keep using HBase on top of Hadoop, rather than the local Linux file system. Hopefully the "small HDFS clusters" issues you mention are survivable. Eventually, some of this programming might wind up on Chinook, our 160-teraflop supercomputer cluster, but that's a ways down the road. I'm starting on my Linux desktop.
>>>>
>>>> I'll try bumping up the file handle limit, restart Hadoop and HBase, and see what happens.
>>>> Ron
>>>>
>>>> -----Original Message-----
>>>> From: Ryan Rawson [mailto:ryanobjc@gmail.com]
>>>> Sent: Friday, April 03, 2009 5:08 PM
>>>> To: hbase-user@hadoop.apache.org
>>>> Subject: Re: Still need help with data upload into HBase
>>>>
>>>> Hey,
>>>>
>>>> Can you check the datanode logs? You might be running into the dreaded xciver limit :-(
>>>>
>>>> Try upping the xciver in hadoop-site.xml... I run at 2048.
>>>>
>>>> -ryan
>>>>
>>>> -----Original Message-----
>>>> From: Ryan Rawson [mailto:ryanobjc@gmail.com]
>>>> Sent: Friday, April 03, 2009 5:13 PM
>>>> To: hbase-user@hadoop.apache.org
>>>> Subject: Re: Still need help with data upload into HBase
>>>>
>>>> "Not replicated yet" is probably what you think - HDFS hasn't placed blocks on more nodes yet. This could be due to the pseudo-distributed nature of your set-up. I'm not familiar with that configuration, so I can't really say more.
>>>>
>>>> If you only have one machine, you might as well just go with local files. HDFS gets you distributed replication, but until you have many machines, it won't buy you anything and will only cause problems, since small HDFS clusters are known to have issues.
>>>>
>>>> Good luck (again!)
>>>> -ryan
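[In HBase terms, "just go with local files" would mean pointing hbase.rootdir at the local file system instead of an hdfs:// URL. A minimal hbase-site.xml entry along those lines - the path itself is only an example:

  <property>
    <name>hbase.rootdir</name>
    <value>file:///home/ron/hbase</value>
  </property>
]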
>>>> On Fri, Apr 3, 2009 at 5:07 PM, Ryan Rawson wrote:
>>>>
>>>>> Hey,
>>>>>
>>>>> Can you check the datanode logs? You might be running into the dreaded xciver limit :-(
>>>>>
>>>>> Try upping the xciver in hadoop-site.xml... I run at 2048.
>>>>>
>>>>> -ryan
>>>>>
>>>>> On Fri, Apr 3, 2009 at 4:35 PM, Taylor, Ronald C wrote:
>>>>>
>>>>>> Hello folks,
>>>>>>
>>>>>> I have just tried using Ryan's doCommit() method for my bulk upload into one HBase table. No luck. I still start to get errors around row 160,000. On-screen, the program starts to generate error msgs like so:
>>>>>>
>>>>>>   ...
>>>>>>   INFO: Retrying connect to server: /127.0.0.1:60383. Already tried 8 time(s).
>>>>>>   Apr 3, 2009 2:39:52 PM org.apache.hadoop.hbase.ipc.HBaseClient$Connection handleConnectionFailure
>>>>>>   INFO: Retrying connect to server: /127.0.0.1:60383. Already tried 9 time(s).
>>>>>>   Apr 3, 2009 2:39:57 PM org.apache.hadoop.hbase.ipc.HBaseClient$Connection handleConnectionFailure
>>>>>>   INFO: Retrying connect to server: /127.0.0.1:60383. Already tried 0 time(s).
>>>>>>   Apr 3, 2009 2:39:58 PM org.apache.hadoop.hbase.ipc.HBaseClient$Connection handleConnectionFailure
>>>>>>   INFO: Retrying connect to server: /127.0.0.1:60383. Already tried 1 time(s).
>>>>>>   ...
>>>>>>
>>>>>> In regard to log file information, I have appended at the bottom some of the output from my hbase--master-.log file, at the place where it looks to me like things might have started to go wrong. Several questions:
>>>>>>
>>>>>> 1) Is there any readily apparent cause for such an HBaseClient$Connection handleConnectionFailure to occur in an HBase installation configured on a Linux desktop to work in the pseudo-distributed operation mode? From my understanding, even importing ~200,000 rows (each row being filled with info for ten columns) is a minimal data set for HBase, and the upload should not be failing like this. FYI - minimal changes were made to the HBase default settings in the HBase ../conf/ config files when I installed HBase 0.19.0. I have one entry in hbase-env.sh, to set JAVA_HOME, and one property entry in hbase-site.xml, to set the hbase.rootdir.
>>>>>>
>>>>>> 2) My Linux box has about 3 GB of memory. I left the HADOOP_HEAP and HBASE_HEAP sizes at their default values, which I understand are 1000 MB each. Should I have changed either value?
>>>>>>
>>>>>> 3) I left the dfs.replication value at the default of "3" in the hadoop-site.xml file, for my test of pseudo-distributed operation. Should I have changed that to "1", for operation on my single machine? Downsizing to "1" would appear to me to negate trying out Hadoop in the pseudo-distributed operation mode, so I left the value "as is", but did I get this wrong?
>>>>>>
>>>>>> 4) In the log output below, you can see that HBase starts to block and then unblock updates to my one HBase table (called the "ppInteractionTable", for protein-protein interaction table). A little later, a msg says that the ppInteractionTable has been closed. At this point, my program has *not* issued a command to close the table - that only happens at the end of the program. So - why is this happening? Also, near the end of my log extract, I get a different error msg: NotReplicatedYetException. I have no idea what that means. Actually, I don't really have a grasp yet on what any of these error msgs is supposed to tell us. So - once again, any help would be much appreciated.
>>>>>>
>>>>>> Ron
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Taylor, Ronald C
>>>>>> Sent: Tuesday, March 31, 2009 5:48 PM
>>>>>> To: 'hbase-user@hadoop.apache.org'
>>>>>> Cc: Taylor, Ronald C
>>>>>> Subject: Novice Hbase user needs help with data upload - gets a RetriesExhaustedException, followed by NoServerForRegionException
>>>>>>
>>>>>> Hello folks,
>>>>>>
>>>>>> This is my first msg to the list - I just joined today, and I am a novice Hadoop/HBase programmer. I have a question:
>>>>>>
>>>>>> I have written a Java program to create an HBase table and then enter a number of rows into the table. The only way I have found so far to do this is to enter each row one-by-one, creating a new BatchUpdate updateObj for each row, doing about ten updateObj.put()'s to add the column data, and then doing a tableObj.commit(updateObj). There's probably a more efficient way (happy to hear, if so!), but this is what I'm starting with.
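[For readers who want to see the pattern being described, a minimal sketch against the 0.19-era client API; the column family, row keys, and values are placeholders, and only the table name is taken from the thread:

  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.io.BatchUpdate;

  public class RowAtATimeLoader {
      public static void main(String[] args) throws Exception {
          HBaseConfiguration conf = new HBaseConfiguration();
          HTable table = new HTable(conf, "ppInteractionTable");
          for (int i = 0; i < 300000; i++) {
              // one BatchUpdate per row, ~ten put()s, then an immediate commit
              BatchUpdate update = new BatchUpdate("row-" + i);
              for (int c = 0; c < 10; c++) {
                  update.put("data:col" + c, ("value-" + c).getBytes());
              }
              table.commit(update);   // one round trip per row
          }
      }
  }

Committing one BatchUpdate at a time means one RPC per row, which is the slowest way to load; the doCommit() approach mentioned earlier in the thread presumably batches several updates before committing.]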
>>>>>> When I do this on input that creates 3000 rows, the program works fine. When I try this on input that would create 300,000 rows (still relatively small for an HBase table, I would think), the program terminates around row 160,000 or so, generating first a RetriesExhaustedException, followed by a NoServerForRegionException. The HBase server crashes, and I have to restart it. The Hadoop server appears to remain OK and does not need restarting.
>>>>>>
>>>>>> Can anybody give me any guidance? I presume that I might need to adjust some setting for larger input in the HBase and/or Hadoop config files. At present, I am using default settings. I have installed Hadoop 0.19.0 and HBase 0.19.0 in the "pseudo" cluster mode on a single machine, my Red Hat Linux desktop, which has 3 GB RAM.
>>>>>>
>>>>>> Any help / suggestions would be much appreciated.
>>>>>>
>>>>>> Cheers,
>>>>>> Ron Taylor