Subject: Re: HBase Region Server crash if column size becomes too big
From: "Kevin O'dell" <kevin.odell@cloudera.com>
To: user@hbase.apache.org
Date: Wed, 11 Sep 2013 11:02:02 -0400

I have not seen the exact error, but if I recall correctly, jobs will fail
if the column is larger than 10 MB and we have not raised the default
setting (which I don't have in front of me)?
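If it is the limit I am thinking of, it is the client-side cap on a single
KeyValue. A minimal sketch of raising it, assuming the property is
hbase.client.keyvalue.maxsize with a 10 MB default (name and default from
memory; verify against the CDH 4.4 docs before relying on this):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

// Assumption: "hbase.client.keyvalue.maxsize" (default 10485760 bytes =
// 10 MB) is the client-side check that rejects oversized KeyValues on
// write; confirm the property name before use.
Configuration conf = HBaseConfiguration.create();
conf.setInt("hbase.client.keyvalue.maxsize", 64 * 1024 * 1024); // 64 MB
// pass this conf to the HTable / job that writes the large columns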
On Wed, Sep 11, 2013 at 10:53 AM, Michael Segel wrote:

> Just out of curiosity...
>
> How wide are the columns?
>
> What's the region size?
>
> Does anyone know the error message you'll get if your row is wider than
> a region?
>
> On Sep 11, 2013, at 9:47 AM, John wrote:
>
> > Sorry, I meant 570,000 columns, not rows.
> >
> > 2013/9/11 John
> >
> >> Thanks for all the answers! The only entry I got in the
> >> "hbase-cmf-hbase1-REGIONSERVER-mydomain.org.log.out" log file after
> >> executing the get command in the HBase shell is this:
> >>
> >> 2013-09-11 16:38:56,175 WARN org.apache.hadoop.ipc.HBaseServer:
> >> (operationTooLarge): {"processingtimems":3196,"client":"192.168.0.1:50629",
> >> "timeRange":[0,9223372036854775807],"starttimems":1378910332920,
> >> "responsesize":108211303,"class":"HRegionServer","table":"P_SO",
> >> "cacheBlocks":true,"families":{"myCf":["ALL"]},"row":"myRow",
> >> "queuetimems":0,"method":"get","totalColumns":1,"maxVersions":1}
> >>
> >> After this the RegionServer is down, nothing more. By the way, I found
> >> out that the row should have ~570,000 rows. The size should be around
> >> ~70 MB.
> >>
> >> Thanks
> >>
> >> 2013/9/11 Bing Jiang
> >>
> >>> Hi John, I think it is a fresh question. Could you post the log from
> >>> the RegionServer that crashed?
> >>>
> >>> On Sep 11, 2013 8:38 PM, "John" wrote:
> >>>
> >>>> Okay, I will take a look at the ColumnPaginationFilter.
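> >>>> If I understand the suggestion, the idea is to page through the wide
> >>>> row instead of fetching all of it in one RPC. A minimal sketch of
> >>>> what I have in mind (0.94 client API; the page size is illustrative
> >>>> and untested):
> >>>>
> >>>> import org.apache.hadoop.conf.Configuration;
> >>>> import org.apache.hadoop.hbase.HBaseConfiguration;
> >>>> import org.apache.hadoop.hbase.client.Get;
> >>>> import org.apache.hadoop.hbase.client.HTable;
> >>>> import org.apache.hadoop.hbase.client.Result;
> >>>> import org.apache.hadoop.hbase.filter.ColumnPaginationFilter;
> >>>> import org.apache.hadoop.hbase.util.Bytes;
> >>>>
> >>>> // inside a method that throws IOException
> >>>> Configuration conf = HBaseConfiguration.create();
> >>>> HTable table = new HTable(conf, "bulkLoadTable");
> >>>> int pageSize = 10000;  // columns per request; tune to cell sizes
> >>>> int offset = 0;
> >>>> while (true) {
> >>>>     Get get = new Get(Bytes.toBytes("oneSpecificRowKey"));
> >>>>     // return at most pageSize columns, starting at column offset
> >>>>     get.setFilter(new ColumnPaginationFilter(pageSize, offset));
> >>>>     Result result = table.get(get);
> >>>>     if (result.isEmpty()) break;          // past the last column
> >>>>     // ... process result.raw() here ...
> >>>>     if (result.size() < pageSize) break;  // last, partial page
> >>>>     offset += pageSize;
> >>>> }
> >>>> table.close();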
> >>>>
> >>>> I tried to reproduce the error. I created a new table and added one
> >>>> new row with 250,000 columns, but everything works fine if I execute
> >>>> a get on that table. The only difference from my original program is
> >>>> that I added the data directly through the HBase Java API and not
> >>>> with the MapReduce bulk load. Maybe that is the reason?
> >>>>
> >>>> I wonder a little about the HDFS structure when I compare both
> >>>> methods (HBase API vs. bulk load). If I add the data through the
> >>>> HBase API there is no file in
> >>>> /hbase/MyTable/5faaf42997925e2f637d8d38c420862f/MyColumnFamily/*,
> >>>> but if I use the bulk load method there is a file for every bulk
> >>>> load I executed:
> >>>>
> >>>> root@pc11:~/hadoop# hadoop fs -ls /hbase/mytestTable/5faaf42997925e2f637d8d38c420862f/mycf
> >>>> root@pc11:~/hadoop# hadoop fs -ls /hbase/bulkLoadTable/f95294bd3c8651a7bbdf9fac27f8961a/mycf2/
> >>>> Found 2 items
> >>>> -rw-r--r--  1 root supergroup  118824462 2013-09-11 11:46 /hbase/bulkLoadTable/f95294bd3c8651a7bbdf9fac27f8961a/mycf2/28e919a0cc8a4592b7f2c09defaaea3a
> >>>> -rw-r--r--  1 root supergroup  158576842 2013-09-11 11:35 /hbase/bulkLoadTable/f95294bd3c8651a7bbdf9fac27f8961a/mycf2/35c5e6df64c04d0a880ffe82593258b8
> >>>>
> >>>> If I execute a get operation in the HBase shell on the "mytestTable"
> >>>> table I get the result:
> >>>>
> >>>> hbase(main):004:0> get 'mytestTable', 'sampleRowKey'
> >>>> ...                                  <-- all results
> >>>> 250000 row(s) in 38.4440 seconds
> >>>>
> >>>> but if I try to get the results for my "bulkLoadTable" I get this
> >>>> (plus the RegionServer crash):
> >>>>
> >>>> hbase(main):003:0> get 'bulkLoadTable', 'oneSpecificRowKey'
> >>>> COLUMN                    CELL
> >>>>
> >>>> ERROR: org.apache.hadoop.hbase.client.RetriesExhaustedException:
> >>>> Failed after attempts=7, exceptions:
> >>>> Wed Sep 11 14:21:05 CEST 2013,
> >>>> org.apache.hadoop.hbase.client.HTable$3@adc4d8f, java.io.IOException:
> >>>> Call to pc17.pool.ifis.uni-luebeck.de/141.83.150.97:60020 failed on
> >>>> local exception: java.io.EOFException
> >>>> Wed Sep 11 14:21:06 CEST 2013,
> >>>> org.apache.hadoop.hbase.client.HTable$3@adc4d8f,
> >>>> java.net.ConnectException: Connection refused
> >>>> Wed Sep 11 14:21:07 CEST 2013,
> >>>> org.apache.hadoop.hbase.client.HTable$3@adc4d8f,
> >>>> org.apache.hadoop.hbase.ipc.HBaseClient$FailedServerException: This
> >>>> server is in the failed servers list:
> >>>> pc17.pool.ifis.uni-luebeck.de/141.83.150.97:60020
> >>>> Wed Sep 11 14:21:08 CEST 2013,
> >>>> org.apache.hadoop.hbase.client.HTable$3@adc4d8f,
> >>>> java.net.ConnectException: Connection refused
> >>>> Wed Sep 11 14:21:10 CEST 2013,
> >>>> org.apache.hadoop.hbase.client.HTable$3@adc4d8f,
> >>>> java.net.ConnectException: Connection refused
> >>>> Wed Sep 11 14:21:12 CEST 2013,
> >>>> org.apache.hadoop.hbase.client.HTable$3@adc4d8f,
> >>>> java.net.ConnectException: Connection refused
> >>>> Wed Sep 11 14:21:16 CEST 2013,
> >>>> org.apache.hadoop.hbase.client.HTable$3@adc4d8f,
> >>>> java.net.ConnectException: Connection refused
> >>>>
> >>>> 2013/9/11 Ted Yu
> >>>>
> >>>>> Take a look at
> >>>>> http://hbase.apache.org/0.94/apidocs/org/apache/hadoop/hbase/filter/ColumnPaginationFilter.html
> >>>>>
> >>>>> Cheers
> >>>>>
> >>>>> On Sep 11, 2013, at 4:42 AM, John wrote:
> >>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> Thanks for your fast answer! With "size becoming too big" I mean I
> >>>>>> have one row with thousands of columns. For example:
> >>>>>>
> >>>>>> myrowkey1 -> column1, column2, column3 ... columnN
> >>>>>>
> >>>>>> What do you mean by "change the batch size"? I will try to create
> >>>>>> a little Java test case to reproduce the problem. It will take a
> >>>>>> moment.
> >>>>>>
> >>>>>> 2013/9/11 Jean-Marc Spaggiari
> >>>>>>
> >>>>>>> Hi John,
> >>>>>>>
> >>>>>>> Just to be sure: what does "the size becomes too big" mean? The
> >>>>>>> size of a single column within this row, or the number of
> >>>>>>> columns?
> >>>>>>>
> >>>>>>> If it's the number of columns, you can change the batch size to
> >>>>>>> get fewer columns in a single call, as in the sketch below. Can
> >>>>>>> you share the relevant piece of code doing the call?
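> >>>>>>> Roughly what I mean, as a sketch (0.94 API; the table name, batch
> >>>>>>> and caching values are illustrative, not from your setup):
> >>>>>>>
> >>>>>>> import org.apache.hadoop.conf.Configuration;
> >>>>>>> import org.apache.hadoop.hbase.HBaseConfiguration;
> >>>>>>> import org.apache.hadoop.hbase.client.HTable;
> >>>>>>> import org.apache.hadoop.hbase.client.Result;
> >>>>>>> import org.apache.hadoop.hbase.client.ResultScanner;
> >>>>>>> import org.apache.hadoop.hbase.client.Scan;
> >>>>>>> import org.apache.hadoop.hbase.util.Bytes;
> >>>>>>>
> >>>>>>> // inside a method that throws IOException
> >>>>>>> Configuration conf = HBaseConfiguration.create();
> >>>>>>> HTable table = new HTable(conf, "myTable");
> >>>>>>> // scan only the wide row; "\0" is the smallest possible suffix,
> >>>>>>> // so the exclusive stop key ends the scan right after myrowkey1
> >>>>>>> Scan scan = new Scan(Bytes.toBytes("myrowkey1"),
> >>>>>>>                      Bytes.toBytes("myrowkey1\0"));
> >>>>>>> scan.setBatch(1000);  // at most 1,000 columns per Result
> >>>>>>> scan.setCaching(1);   // one (partial-row) Result per RPC
> >>>>>>> ResultScanner scanner = table.getScanner(scan);
> >>>>>>> for (Result r : scanner) {
> >>>>>>>     // each r carries at most 1,000 of the row's columns
> >>>>>>> }
> >>>>>>> scanner.close();
> >>>>>>> table.close();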
> >>>>>>> JM
> >>>>>>>
> >>>>>>> 2013/9/11 John
> >>>>>>>
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> I store a lot of columns for one row key, and if the size
> >>>>>>>> becomes too big the relevant RegionServer crashes if I try to
> >>>>>>>> get or scan the row. For example, if I try to get the relevant
> >>>>>>>> row I get this error:
> >>>>>>>>
> >>>>>>>> 2013-09-11 12:46:43,696 WARN org.apache.hadoop.ipc.HBaseServer:
> >>>>>>>> (operationTooLarge): {"processingtimems":3091,"client":"192.168.0.34:52488","ti$
> >>>>>>>>
> >>>>>>>> If I try to load the relevant row via Apache Pig and the
> >>>>>>>> HBaseStorage loader (which uses the scan operation), I get this
> >>>>>>>> message and after that the RegionServer crashes:
> >>>>>>>>
> >>>>>>>> 2013-09-11 10:30:23,542 WARN org.apache.hadoop.ipc.HBaseServer:
> >>>>>>>> (responseTooLarge): {"processingtimems":1851,"call":"next(-588368116791418695, 1), rpc version=1, client version=29,$
> >>>>>>>>
> >>>>>>>> I'm using Cloudera 4.4.0 with HBase 0.94.6-cdh4.4.0.
> >>>>>>>>
> >>>>>>>> Any clues?
> >>>>>>>>
> >>>>>>>> Regards

> The opinions expressed here are mine; while they may reflect a cognitive
> thought, that is purely accidental.
> Use at your own risk.
> Michael Segel
> michael_segel (AT) hotmail.com

--
Kevin O'Dell
Systems Engineer, Cloudera