From: Bing Jiang
To: user@hbase.apache.org
Date: Wed, 11 Sep 2013 21:41:05 +0800
Subject: Re: HBase Region Server crash if column size becomes too big

Hi John,

I think this is a new question. Could you post the log from the region
server that crashed?

On Sep 11, 2013 8:38 PM, "John" wrote:

> Okay, I will take a look at the ColumnPaginationFilter.
>
> I tried to reproduce the error. I created a new table and added one new
> row with 250,000 columns, but everything works fine if I execute a get
> against the table. The only difference from my original program is that
> I added the data directly through the HBase Java API and not with the
> MapReduce bulk load. Maybe that is the reason?
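> For context, a write path like the one described might look roughly like
> this (a minimal, untested sketch, not the original code; the table, family,
> row key, and column names are placeholders taken from this thread):
>
>     import org.apache.hadoop.conf.Configuration;
>     import org.apache.hadoop.hbase.HBaseConfiguration;
>     import org.apache.hadoop.hbase.client.HTable;
>     import org.apache.hadoop.hbase.client.Put;
>     import org.apache.hadoop.hbase.util.Bytes;
>
>     public class WideRowWriter {
>         public static void main(String[] args) throws Exception {
>             Configuration conf = HBaseConfiguration.create();
>             HTable table = new HTable(conf, "mytestTable");
>             byte[] family = Bytes.toBytes("mycf");
>             Put put = new Put(Bytes.toBytes("sampleRowKey"));
>             // One row, 250,000 columns; a real run would probably flush
>             // in smaller chunks to keep client-side memory bounded.
>             for (int i = 0; i < 250000; i++) {
>                 put.add(family, Bytes.toBytes("col" + i),
>                         Bytes.toBytes("value" + i));
>             }
>             table.put(put);
>             table.close();
>         }
>     }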
> I wonder a little bit about the HDFS structure when I compare the two
> methods (HBase API vs. bulk load). If I add the data through the HBase
> API there is no file in
> /hbase/MyTable/5faaf42997925e2f637d8d38c420862f/MyColumnFamily/*, but if
> I use the bulk load method there is a file for every bulk load I
> executed:
>
> root@pc11:~/hadoop# hadoop fs -ls /hbase/mytestTable/5faaf42997925e2f637d8d38c420862f/mycf
> root@pc11:~/hadoop# hadoop fs -ls /hbase/bulkLoadTable/f95294bd3c8651a7bbdf9fac27f8961a/mycf2/
> Found 2 items
> -rw-r--r--   1 root supergroup  118824462 2013-09-11 11:46 /hbase/bulkLoadTable/f95294bd3c8651a7bbdf9fac27f8961a/mycf2/28e919a0cc8a4592b7f2c09defaaea3a
> -rw-r--r--   1 root supergroup  158576842 2013-09-11 11:35 /hbase/bulkLoadTable/f95294bd3c8651a7bbdf9fac27f8961a/mycf2/35c5e6df64c04d0a880ffe82593258b8
>
> If I execute a get operation in the HBase shell against the "MyTable"
> table I get the result:
>
> hbase(main):004:0> get 'mytestTable', 'sampleRowKey'
> ... <-- all results
> 250000 row(s) in 38.4440 seconds
>
> But if I try to get the results from my "bulkLoadTable" I get this (plus
> the region server crash):
>
> hbase(main):003:0> get 'bulkLoadTable', 'oneSpecificRowKey'
> COLUMN                                  CELL
>
> ERROR: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed
> after attempts=7, exceptions:
> Wed Sep 11 14:21:05 CEST 2013, org.apache.hadoop.hbase.client.HTable$3@adc4d8f,
> java.io.IOException: Call to pc17.pool.ifis.uni-luebeck.de/141.83.150.97:60020
> failed on local exception: java.io.EOFException
> Wed Sep 11 14:21:06 CEST 2013, org.apache.hadoop.hbase.client.HTable$3@adc4d8f,
> java.net.ConnectException: Connection refused
> Wed Sep 11 14:21:07 CEST 2013, org.apache.hadoop.hbase.client.HTable$3@adc4d8f,
> org.apache.hadoop.hbase.ipc.HBaseClient$FailedServerException: This server
> is in the failed servers list: pc17.pool.ifis.uni-luebeck.de/141.83.150.97:60020
> Wed Sep 11 14:21:08 CEST 2013, org.apache.hadoop.hbase.client.HTable$3@adc4d8f,
> java.net.ConnectException: Connection refused
> Wed Sep 11 14:21:10 CEST 2013, org.apache.hadoop.hbase.client.HTable$3@adc4d8f,
> java.net.ConnectException: Connection refused
> Wed Sep 11 14:21:12 CEST 2013, org.apache.hadoop.hbase.client.HTable$3@adc4d8f,
> java.net.ConnectException: Connection refused
> Wed Sep 11 14:21:16 CEST 2013, org.apache.hadoop.hbase.client.HTable$3@adc4d8f,
> java.net.ConnectException: Connection refused
>
>
> 2013/9/11 Ted Yu
>
> > Take a look at
> > http://hbase.apache.org/0.94/apidocs/org/apache/hadoop/hbase/filter/ColumnPaginationFilter.html
> >
> > Cheers
> >
> > On Sep 11, 2013, at 4:42 AM, John wrote:
> >
> > > Hi,
> > >
> > > Thanks for your fast answer! With "the size becoming too big" I mean
> > > that I have one row with thousands of columns. For example:
> > >
> > > myrowkey1 -> column1, column2, column3 ... columnN
> > >
> > > What do you mean by "change the batch size"? I will try to create a
> > > little Java test case to reproduce the problem. It will take a moment.
> > >
> > > 2013/9/11 Jean-Marc Spaggiari
> > >
> > >> Hi John,
> > >>
> > >> Just to be sure: what is "the size becomes too big"? The size of a
> > >> single column within this row, or the number of columns?
> > >>
> > >> If it's the number of columns, you can change the batch size to get
> > >> fewer columns in a single call. Can you share the relevant piece of
> > >> code doing the call?
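> > >> Something along these lines (an untested sketch; "table" is assumed
> > >> to be an open HTable and the row key is a placeholder) caps how many
> > >> columns come back per next() call, so one wide row is split across
> > >> several Results instead of one huge response:
> > >>
> > >>     Scan scan = new Scan(Bytes.toBytes("myrowkey1"));
> > >>     // (add a stop row if you want to limit the scan to this one row)
> > >>     scan.setBatch(1000); // at most 1000 columns per Result
> > >>     ResultScanner scanner = table.getScanner(scan);
> > >>     for (Result result : scanner) {
> > >>         // each Result holds at most 1000 cells of the wide row
> > >>         System.out.println("got " + result.size() + " cells");
> > >>     }
> > >>     scanner.close();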
> > >>
> > >> JM
> > >>
> > >> 2013/9/11 John
> > >>
> > >>> Hi,
> > >>>
> > >>> I store a lot of columns for one row key, and if the size becomes
> > >>> too big the relevant region server crashes when I try to get or
> > >>> scan the row. For example, if I try to get the relevant row I get
> > >>> this error:
> > >>>
> > >>> 2013-09-11 12:46:43,696 WARN org.apache.hadoop.ipc.HBaseServer:
> > >>> (operationTooLarge): {"processingtimems":3091,"client":"192.168.0.34:52488","ti$
> > >>>
> > >>> If I try to load the relevant row via Apache Pig and the
> > >>> HBaseStorage loader (which uses the scan operation) I get this
> > >>> message, and after that the region server crashes:
> > >>>
> > >>> 2013-09-11 10:30:23,542 WARN org.apache.hadoop.ipc.HBaseServer:
> > >>> (responseTooLarge): {"processingtimems":1851,"call":"next(-588368116791418695, 1), rpc version=1, client version=29,$
> > >>>
> > >>> I'm using Cloudera 4.4.0 with HBase 0.94.6-cdh4.4.0.
> > >>>
> > >>> Any clues?
> > >>>
> > >>> regards
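For reference, paging through a wide row with the ColumnPaginationFilter
from Ted's link might look like this (an untested sketch; the method name,
table handle, and page size are placeholders, not code from this thread):

    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.filter.ColumnPaginationFilter;

    public class WideRowPager {
        // Read one wide row in pages of at most pageSize columns,
        // so no single RPC response carries the whole row.
        static void readWideRow(HTable table, byte[] row, int pageSize)
                throws Exception {
            int offset = 0;
            while (true) {
                Get get = new Get(row);
                get.setFilter(new ColumnPaginationFilter(pageSize, offset));
                Result result = table.get(get);
                if (result.isEmpty()) {
                    break; // no more columns left in this row
                }
                // handle up to pageSize columns here, then advance the window
                offset += result.size();
            }
        }
    }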