hbase-user mailing list archives

From "Kevin O'dell" <kevin.od...@cloudera.com>
Subject Re: HBase Region Server crash if column size become to big
Date Wed, 11 Sep 2013 15:15:04 GMT
Hey John,

  You can try upping hbase.client.keyvalue.maxsize from 10MB to 500MB,
BUT it is there for a reason :) The response coming back is 169MB. Have you
tried changing the batch size that JM referred to earlier?
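
To put numbers on that, here is a rough sketch (plain Java, no HBase
dependency) that turns the "responsesize" from the log quoted below into
MiB and estimates how many columns per call would fit in a given budget.
The class and method names and the 10 MiB budget are made up for
illustration. The per-column figure is only an average over the whole
response, so treat the result as a starting point for a batch size (for
example the value passed to Scan.setBatch), not a tuned setting.

```java
import java.util.Locale;

public class ResponseSizeEstimate {
    // Bytes in one MiB; the "169MB" above matches responsesize / 1024 / 1024.
    static final long MIB = 1024L * 1024L;

    /** Converts a byte count to MiB. */
    public static double toMiB(long bytes) {
        return (double) bytes / MIB;
    }

    /** Average bytes returned per column, given the logged response size. */
    public static double bytesPerColumn(long responseBytes, long columns) {
        return (double) responseBytes / columns;
    }

    /** Largest column count per call that keeps a response under targetBytes. */
    public static long columnsPerBatch(long targetBytes, double bytesPerColumn) {
        return (long) Math.floor(targetBytes / bytesPerColumn);
    }

    public static void main(String[] args) {
        long responseBytes = 177_600_006L; // "responsesize" from the quoted log
        long columns = 600_000L;           // columns written by the test program
        long targetBytes = 10 * MIB;       // hypothetical per-call budget (assumption)

        double perColumn = bytesPerColumn(responseBytes, columns);
        long batch = columnsPerBatch(targetBytes, perColumn);
        long calls = (columns + batch - 1) / batch; // ceiling division

        System.out.printf(Locale.ROOT, "response: %.1f MiB%n", toMiB(responseBytes));
        System.out.printf(Locale.ROOT, "per column: %.0f bytes%n", perColumn);
        System.out.println("columns per batch: " + batch);
        System.out.println("calls needed: " + calls);
    }
}
```

With the figures from the log this works out to about 169.4 MiB in one
response, about 296 bytes per column, and 35424 columns per call (17 calls)
for the assumed 10 MiB budget.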


On Wed, Sep 11, 2013 at 11:08 AM, John <johnnyenglish739@gmail.com> wrote:

> @michael: What do you mean by wide? The size of one column? The size of
> one row is roughly 200 characters. What is the region size?
>
> @kevin: Which option do I have to change?
>
> Finally, I was able to write a little Java program that reproduces the
> error. It creates a lot of columns for one rowkey. You can find the
> program here: http://pastebin.com/TFJRtCEg
>
> After creating 600000 columns and executing this command in the hbase
> shell:
>
> get 'mytestTable', 'sampleRowKey'
>
> The RegionServer crashes again with the same error:
>
> 2013-09-11 16:58:26,546 WARN org.apache.hadoop.ipc.HBaseServer:
> (operationTooLarge): {"processingtimems":2650,"client":"192.168.0.1:34944","timeRange":[0,9223372036854775807],"starttimems":1378911503836,"responsesize":177600006,"class":"HRegionServer","table":"mytestTable","cacheBlocks":true,"families":{"mycf":["ALL"]},"row":"sampleRowKey","queuetimems":0,"method":"get","totalColumns":1,"maxVersions":1}
>
> Maybe someone can test it?
>
> thanks
>
> 2013/9/11 Kevin O'dell <kevin.odell@cloudera.com>
>
> > I have not seen the exact error, but if I recall correctly jobs will fail
> > if the column is larger than 10MB and we have not raised the default
> > setting (which I don't have in front of me)?
> >
> >
> > On Wed, Sep 11, 2013 at 10:53 AM, Michael Segel
> > <michael_segel@hotmail.com> wrote:
> >
> > > Just out of curiosity...
> > >
> > > How wide are the columns?
> > >
> > > What's the region size?
> > >
> > > Does anyone know the error message you'll get if your row is wider
> > > than a region?
> > >
> > >
> > > On Sep 11, 2013, at 9:47 AM, John <johnnyenglish739@gmail.com> wrote:
> > >
> > > > sry, I mean 570000 columns, not rows
> > > >
> > > >
> > > > 2013/9/11 John <johnnyenglish739@gmail.com>
> > > >
> > > >> thanks for all the answers! The only entry I got in the
> > > >> "hbase-cmf-hbase1-REGIONSERVER-mydomain.org.log.out" log file after
> > > >> executing the get command in the hbase shell is this:
> > > >>
> > > >> 2013-09-11 16:38:56,175 WARN org.apache.hadoop.ipc.HBaseServer:
> > > >> (operationTooLarge): {"processingtimems":3196,"client":"192.168.0.1:50629","timeRange":[0,9223372036854775807],"starttimems":1378910332920,"responsesize":108211303,"class":"HRegionServer","table":"P_SO","cacheBlocks":true,"families":{"myCf":["ALL"]},"row":"myRow","queuetimems":0,"method":"get","totalColumns":1,"maxVersions":1}
> > > >>
> > > >> After this the RegionServer is down, nothing more. BTW I found out
> > > >> that the row should have ~570000 rows. The size should be around ~70MB
> > > >> Thanks
> > > >>
> > > >>
> > > >>
> > > >> 2013/9/11 Bing Jiang <jiangbinglover@gmail.com>
> > > >>
> > > >>> hi john.
> > > >>> I think it is a fresh question. Could you print the log from the
> > > >>> crashed regionserver?
> > > >>> On Sep 11, 2013 8:38 PM, "John" <johnnyenglish739@gmail.com> wrote:
> > > >>>
> > > >>>> Okay, I will take a look at the ColumnPaginationFilter.
> > > >>>>
> > > >>>> I tried to reproduce the error. I created a new table and added one
> > > >>>> new row with 250 000 columns, but everything works fine if I execute
> > > >>>> a get on the table. The only difference to my original program was
> > > >>>> that I added the data directly through the HBase Java API and not
> > > >>>> with the MapReduce bulk load. Maybe that can be the reason?
> > > >>>>
> > > >>>> I wonder a little bit about the HDFS structure if I compare both
> > > >>>> methods (HBase API/bulk load). If I add the data through the HBase
> > > >>>> API there is no file in
> > > >>>> /hbase/MyTable/5faaf42997925e2f637d8d38c420862f/MyColumnFamily/*,
> > > >>>> but if I use the bulk load method there is a file for every time I
> > > >>>> executed a new bulk load:
> > > >>>>
> > > >>>> root@pc11:~/hadoop# hadoop fs -ls /hbase/mytestTable/5faaf42997925e2f637d8d38c420862f/mycf
> > > >>>> root@pc11:~/hadoop# hadoop fs -ls /hbase/bulkLoadTable/f95294bd3c8651a7bbdf9fac27f8961a/mycf2/
> > > >>>> Found 2 items
> > > >>>> -rw-r--r--   1 root supergroup  118824462 2013-09-11 11:46 /hbase/bulkLoadTable/f95294bd3c8651a7bbdf9fac27f8961a/mycf2/28e919a0cc8a4592b7f2c09defaaea3a
> > > >>>> -rw-r--r--   1 root supergroup  158576842 2013-09-11 11:35 /hbase/bulkLoadTable/f95294bd3c8651a7bbdf9fac27f8961a/mycf2/35c5e6df64c04d0a880ffe82593258b8
> > > >>>>
> > > >>>> If I execute a get operation in the hbase shell on the "MyTable"
> > > >>>> table I get the result:
> > > >>>>
> > > >>>> hbase(main):004:0> get 'mytestTable', 'sampleRowKey'
> > > >>>> ... <-- all results
> > > >>>> 250000 row(s) in 38.4440 seconds
> > > >>>>
> > > >>>> but if I try to get the results for my "bulkLoadTable" I get this
> > > >>>> (plus the region server crash):
> > > >>>>
> > > >>>> hbase(main):003:0> get 'bulkLoadTable', 'oneSpecificRowKey'
> > > >>>> COLUMN                          CELL
> > > >>>>
> > > >>>> ERROR: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=7, exceptions:
> > > >>>> Wed Sep 11 14:21:05 CEST 2013, org.apache.hadoop.hbase.client.HTable$3@adc4d8f, java.io.IOException: Call to pc17.pool.ifis.uni-luebeck.de/141.83.150.97:60020 failed on local exception: java.io.EOFException
> > > >>>> Wed Sep 11 14:21:06 CEST 2013, org.apache.hadoop.hbase.client.HTable$3@adc4d8f, java.net.ConnectException: Connection refused
> > > >>>> Wed Sep 11 14:21:07 CEST 2013, org.apache.hadoop.hbase.client.HTable$3@adc4d8f, org.apache.hadoop.hbase.ipc.HBaseClient$FailedServerException: This server is in the failed servers list: pc17.pool.ifis.uni-luebeck.de/141.83.150.97:60020
> > > >>>> Wed Sep 11 14:21:08 CEST 2013, org.apache.hadoop.hbase.client.HTable$3@adc4d8f, java.net.ConnectException: Connection refused
> > > >>>> Wed Sep 11 14:21:10 CEST 2013, org.apache.hadoop.hbase.client.HTable$3@adc4d8f, java.net.ConnectException: Connection refused
> > > >>>> Wed Sep 11 14:21:12 CEST 2013, org.apache.hadoop.hbase.client.HTable$3@adc4d8f, java.net.ConnectException: Connection refused
> > > >>>> Wed Sep 11 14:21:16 CEST 2013, org.apache.hadoop.hbase.client.HTable$3@adc4d8f, java.net.ConnectException: Connection refused
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>>> 2013/9/11 Ted Yu <yuzhihong@gmail.com>
> > > >>>>
> > > >>>>> Take a look at
> > > >>>>> http://hbase.apache.org/0.94/apidocs/org/apache/hadoop/hbase/filter/ColumnPaginationFilter.html
> > > >>>>>
> > > >>>>> Cheers
> > > >>>>>
> > > >>>>> On Sep 11, 2013, at 4:42 AM, John <johnnyenglish739@gmail.com> wrote:
> > > >>>>>
> > > >>>>>> Hi,
> > > >>>>>>
> > > >>>>>> thanks for your fast answer! By "size becoming too big" I mean I
> > > >>>>>> have one row with thousands of columns. For example:
> > > >>>>>>
> > > >>>>>> myrowkey1 -> column1, column2, column3 ... columnN
> > > >>>>>>
> > > >>>>>> What do you mean by "change the batch size"? I will try to write a
> > > >>>>>> little Java test program to reproduce the problem. It will take a
> > > >>>>>> moment.
> > > >>>>>>
> > > >>>>>>
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> 2013/9/11 Jean-Marc Spaggiari <jean-marc@spaggiari.org>
> > > >>>>>>
> > > >>>>>>> Hi John,
> > > >>>>>>>
> > > >>>>>>> Just to be sure: what is "the size become too big"? The size of a
> > > >>>>>>> single column within this row? Or the number of columns?
> > > >>>>>>>
> > > >>>>>>> If it's the number of columns, you can change the batch size to
> > > >>>>>>> get fewer columns in a single call. Can you share the relevant
> > > >>>>>>> piece of code doing the call?
> > > >>>>>>>
> > > >>>>>>> JM
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> 2013/9/11 John <johnnyenglish739@gmail.com>
> > > >>>>>>>
> > > >>>>>>>> Hi,
> > > >>>>>>>>
> > > >>>>>>>> I store a lot of columns for one row key, and if the size becomes
> > > >>>>>>>> too big the relevant Region Server crashes if I try to get or scan
> > > >>>>>>>> the row. For example, if I try to get the relevant row I get this
> > > >>>>>>>> error:
> > > >>>>>>>>
> > > >>>>>>>> 2013-09-11 12:46:43,696 WARN org.apache.hadoop.ipc.HBaseServer:
> > > >>>>>>>> (operationTooLarge): {"processingtimems":3091,"client":"192.168.0.34:52488","ti$
> > > >>>>>>>>
> > > >>>>>>>> If I try to load the relevant row via Apache Pig and the
> > > >>>>>>>> HBaseStorage loader (which uses the scan operation) I get this
> > > >>>>>>>> message, and after that the Region Server crashes:
> > > >>>>>>>>
> > > >>>>>>>> 2013-09-11 10:30:23,542 WARN org.apache.hadoop.ipc.HBaseServer:
> > > >>>>>>>> (responseTooLarge): {"processingtimems":1851,"call":"next(-588368116791418695, 1), rpc version=1, client version=29,$
> > > >>>>>>>>
> > > >>>>>>>> I'm using Cloudera 4.4.0 with 0.94.6-cdh4.4.0
> > > >>>>>>>>
> > > >>>>>>>> Any clues?
> > > >>>>>>>>
> > > >>>>>>>> regards
> > > >>>>>>>
> > > >>>>>
> > > >>>>
> > > >>>
> > > >>
> > > >>
> > >
> > > The opinions expressed here are mine, while they may reflect a cognitive
> > > thought, that is purely accidental.
> > > Use at your own risk.
> > > Michael Segel
> > > michael_segel (AT) hotmail.com
> > >
> > >
> > >
> > >
> > >
> > >
> >
> >
> > --
> > Kevin O'Dell
> > Systems Engineer, Cloudera
> >
>



-- 
Kevin O'Dell
Systems Engineer, Cloudera
