hbase-user mailing list archives

From "Kevin O'dell" <kevin.od...@cloudera.com>
Subject Re: Table in Inconsistent State; Perpetually pending region server transitions while loading lot of data into Hbase via MR
Date Thu, 01 Nov 2012 15:35:12 GMT
Michael,

  I am not sure. I recommend 10GB as a solid middle ground so that you have
room to scale in your cluster.  From what I understand, once you get to 20GB+
there are some adverse performance issues.  It is the same as recommending
2GB for HFilev1: it is a good middle ground, with 4GB as a max.  That being
said, we have customers running 10GB region sizes on 0.90 successfully, but
there are known kinks.  So it is still just a matter of what works for you!
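
To put rough numbers on the region sizes mentioned in this thread, here is a minimal back-of-the-envelope sketch. It assumes the figures quoted below (about 1.5 TB of data, 10 region servers) and ignores compression, HDFS replication, and uneven key distribution, so treat the output as an order-of-magnitude estimate only. The function name `region_estimate` is made up for illustration.

```python
import math


def region_estimate(total_gb, max_region_gb, num_servers):
    """Return (total regions, regions per server) for a given max region size.

    Assumes data is spread evenly and every region fills up to the limit.
    """
    regions = math.ceil(total_gb / max_region_gb)
    per_server = math.ceil(regions / num_servers)
    return regions, per_server


if __name__ == "__main__":
    total_gb = 1.5 * 1024  # ~1.5 TB, the load size quoted in the thread
    for size_gb in (2, 4, 10, 20):
        regions, per_server = region_estimate(total_gb, size_gb, 10)
        print(f"{size_gb:>2} GB regions: {regions} regions, ~{per_server} per server")
```

Under these assumptions, a smaller region size means many more regions, and therefore many more splits and assignments while the load is running.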

On Thu, Nov 1, 2012 at 9:50 AM, Michael Segel <michael_segel@hotmail.com> wrote:

> Just out of curiosity...
>
> What's the impact on having regions of 10GB or larger?
>
> What does that do to your footprint in memory and the time it takes to
> split or compact a region?
>
> -Mike
>
> On Nov 1, 2012, at 8:35 AM, Kevin O'dell <kevin.odell@cloudera.com> wrote:
>
> > A couple of thoughts (it is still early here, so bear with me):
> >
> > Did you presplit your table?
> >
> > You are on .92, might as well take advantage of HFilev2 and use 10GB
> > region sizes
> >
> > Loading over MR, I am assuming puts?  Did you tune your memstore and Hlog
> > size?
> >
> > You aren't using a different client version or something strange like
> > that, are you?
> >
> > The "can't close hlog" messages seem to indicate an inability to talk to
> > HDFS.  Did you have connection issues there?
> >
> >
> >
> > On Thu, Nov 1, 2012 at 5:20 AM, ramkrishna vasudevan <
> > ramkrishna.s.vasudevan@gmail.com> wrote:
> >
> >> Can you try restarting the cluster, I mean the master and RS?
> >> Also, if this persists, try clearing the zk data and restarting.
> >>
> >> Regards
> >> Ram
> >>
> >> On Thu, Nov 1, 2012 at 2:46 PM, Cheng Su <scarcer.cn@gmail.com> wrote:
> >>
> >>> Sorry, my mistake. Please ignore the part about the "max store size of
> >>> a single CF".
> >>>
> >>> m(_ _)m
> >>>
> >>> On Thu, Nov 1, 2012 at 4:43 PM, Ameya Kantikar <ameya@groupon.com> wrote:
> >>>> Thanks Cheng. I'll try increasing my max region size limit.
> >>>>
> >>>> However I am not clear with this math:
> >>>>
> >>>> "Since you set the max file size to 2G, you can only store 2XN G data
> >>>> into a single CF."
> >>>>
> >>>> Why is that? My assumption is, even though a single region can only be
> >>>> 2 GB, I can still have hundreds of regions, and hence can store 200GB+
> >>>> data in a single CF on my 10 machine cluster.
> >>>>
> >>>> Ameya
> >>>>
> >>>>
> >>>> On Thu, Nov 1, 2012 at 1:19 AM, Cheng Su <scarcer.cn@gmail.com> wrote:
> >>>>
> >>>>> I ran into the same problem recently.
> >>>>> I'm not sure the error log is exactly the same, but I do have the
> >>>>> same exception
> >>>>>
> >>>>> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException:
> >>>>> Failed 1 action: NotServingRegionException: 1 time, servers with
> >>>>> issues: smartdeals-hbase8-snc1.snc1:60020,
> >>>>>
> >>>>> and the table is also neither enabled nor disabled, thus I can't
> >>>>> drop it.
> >>>>>
> >>>>> I guess the problem is the total store size.
> >>>>> How many region servers do you have?
> >>>>> Since you set the max file size to 2G, you can only store 2XN G data
> >>>>> into a single CF.
> >>>>> (N is the number of your region servers)
> >>>>>
> >>>>> You might want to increase the max file size or add region servers.
> >>>>>
> >>>>> On Thu, Nov 1, 2012 at 3:29 PM, Ameya Kantikar <ameya@groupon.com> wrote:
> >>>>>> One more thing, the Hbase table in question is neither enabled nor
> >>>>>> disabled:
> >>>>>>
> >>>>>> hbase(main):006:0> is_disabled 'userTable1'
> >>>>>> false
> >>>>>>
> >>>>>> 0 row(s) in 0.0040 seconds
> >>>>>>
> >>>>>> hbase(main):007:0> is_enabled 'userTable1'
> >>>>>> false
> >>>>>>
> >>>>>> 0 row(s) in 0.0040 seconds
> >>>>>>
> >>>>>> Ameya
> >>>>>>
> >>>>>> On Thu, Nov 1, 2012 at 12:02 AM, Ameya Kantikar <ameya@groupon.com> wrote:
> >>>>>>
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> I am trying to load a lot of data (around 1.5 TB) into a single Hbase
> >>>>>>> table. I have set the region size to 2 GB. I also
> >>>>>>> set hbase.regionserver.handler.count to 30.
> >>>>>>>
> >>>>>>> When I start loading data via MR, after a while, tasks start failing
> >>>>>>> with the following error:
> >>>>>>>
> >>>>>>>
> >>>>>>> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 action: NotServingRegionException: 1 time, servers with issues: smartdeals-hbase8-snc1.snc1:60020,
> >>>>>>>      at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1641)
> >>>>>>>      at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1409)
> >>>>>>>      at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:943)
> >>>>>>>      at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:820)
> >>>>>>>      at org.apache.hadoop.hbase.client.HTable.put(HTable.java:795)
> >>>>>>>      at com..mr.hbase.LoadUserCacheInHbase$TokenizerMapper.map(LoadUserCacheInHbase.java:83)
> >>>>>>>      at com..mr.hbase.LoadUserCacheInHbase$TokenizerMapper.map(LoadUserCacheInHbase.java:33)
> >>>>>>>      at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
> >>>>>>>      at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:645)
> >>>>>>>      at org.apache.hadoop.mapred.MapTask.run(MapTask.j
> >>>>>>>
> >>>>>>> On the hbase8 machine I see following in logs:
> >>>>>>>
> >>>>>>> ERROR org.apache.hadoop.hbase.regionserver.wal.HLog: Error while syncing, requesting close of hlog
> >>>>>>> java.io.IOException: Reflection
> >>>>>>>        at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.sync(SequenceFileLogWriter.java:230)
> >>>>>>>        at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1109)
> >>>>>>>        at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1213)
> >>>>>>>        at org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.run(HLog.java:1071)
> >>>>>>>        at java.lang.Thread.run(Thread.java:662)
> >>>>>>> Caused by: java.lang.reflect.InvocationTargetException
> >>>>>>>        at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
> >>>>>>>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >>>>>>>        at java.lang.reflect.Method.invoke(Method.java:597)
> >>>>>>>        at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.sync(SequenceFileLogWriter.java:228)
> >>>>>>>        ... 4 more
> >>>>>>>
> >>>>>>>
> >>>>>>> I only have 15 map tasks each on a 10 machine cluster (total 150 map
> >>>>>>> tasks entering data into the Hbase table).
> >>>>>>>
> >>>>>>> Further, I see 2-3 regions perpetually under "Regions in Transitions"
> >>>>>>> in the Hbase master web console as follows:
> >>>>>>>
> >>>>>>> 8dcb3edee4e43faa3dbeac2db4f12274 userTable1,pookydearest@hotmail.com,1351728961461.8dcb3edee4e43faa3dbeac2db4f12274. state=PENDING_OPEN, ts=Thu Nov 01 06:39:57 UTC 2012 (409s ago), server=smartdeals-hbase1-snc1.snc1,60020,1351751785514
> >>>>>>> bb91fd0c855e60dd4159e0ad3fd52cda userTable1,m_skaare@yahoo.com,1351728968936.bb91fd0c855e60dd4159e0ad3fd52cda. state=PENDING_OPEN, ts=Thu Nov 01 06:42:17 UTC 2012 (269s ago), server=smartdeals-hbase3-snc1.snc1,60020,1351747466016
> >>>>>>> bd44334a11464baf85013c97d673e600 userTable1,tammikilgore@gmail.com,1351728952308.bd44334a11464baf85013c97d673e600. state=PENDING_OPEN, ts=Thu Nov 01 06:42:17 UTC 2012 (269s ago), server=smartdeals-hbase1-snc1.snc1,60020,1351751785514
> >>>>>>> ed1f7e7908fc232f10d78dd1e796a5d7 userTable1,jwoodel@triad.rr.com,1351728971232.ed1f7e7908fc232f10d78dd1e796a5d7. state=PENDING_OPEN, ts=Thu Nov 01 06:37:37 UTC 2012 (549s ago), server=smartdeals-hbase3-snc1.snc1,60020,1351747466016
> >>>>>>>
> >>>>>>>
> >>>>>>> Note these are not going away even after 30 minutes.
> >>>>>>>
> >>>>>>> Further after running
> >>>>>>>
> >>>>>>> hbase hbck -summary I get following:
> >>>>>>>
> >>>>>>> Summary:
> >>>>>>>  -ROOT- is okay.
> >>>>>>>    Number of regions: 1
> >>>>>>>    Deployed on:  smartdeals-hbase7-snc1.snc1,60020,1351747458782
> >>>>>>>  .META. is okay.
> >>>>>>>    Number of regions: 1
> >>>>>>>    Deployed on:  smartdeals-hbase7-snc1.snc1,60020,1351747458782
> >>>>>>>  test1 is okay.
> >>>>>>>    Number of regions: 1
> >>>>>>>    Deployed on:  smartdeals-hbase2-snc1.snc1,60020,1351747457308
> >>>>>>>  userTable1 is okay.
> >>>>>>>    Number of regions: 32
> >>>>>>>    Deployed on:  smartdeals-hbase10-snc1.snc1,60020,1351747456776
> >>>>>>>      smartdeals-hbase2-snc1.snc1,60020,1351747457308
> >>>>>>>      smartdeals-hbase4-snc1.snc1,60020,1351747455571
> >>>>>>>      smartdeals-hbase5-snc1.snc1,60020,1351747458579
> >>>>>>>      smartdeals-hbase6-snc1.snc1,60020,1351747458186
> >>>>>>>      smartdeals-hbase7-snc1.snc1,60020,1351747458782
> >>>>>>>      smartdeals-hbase8-snc1.snc1,60020,1351747459112
> >>>>>>>      smartdeals-hbase9-snc1.snc1,60020,1351747455106
> >>>>>>> 24 inconsistencies detected.
> >>>>>>> Status: INCONSISTENT
> >>>>>>>
> >>>>>>> In master logs I am seeing following error:
> >>>>>>>
> >>>>>>> ERROR org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment in: smartdeals-hbase3-snc1.snc1,60020,1351747466016 due to
> >>>>>>> org.apache.hadoop.hbase.regionserver.RegionAlreadyInTransitionException: Received:OPEN for the region:userTable1,m_skaare@yahoo.com,1351728968936.bb91fd0c855e60dd4159e0ad3fd52cda. ,which we are already trying to OPEN.
> >>>>>>>       at org.apache.hadoop.hbase.regionserver.HRegionServer.checkIfRegionInTransition(HRegionServer.java:2499)
> >>>>>>>       at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:2457)
> >>>>>>>       at sun.reflect.GeneratedMethodAccessor24.invoke(Unknown Source)
> >>>>>>>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >>>>>>>       at java.lang.reflect.Method.invoke(Method.java:597)
> >>>>>>>       at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
> >>>>>>>       at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1336)
> >>>>>>>
> >>>>>>>
> >>>>>>> Am I missing something? How do I recover from this? How do I load
> >>>>>>> a lot of data via MR into Hbase tables?
> >>>>>>>
> >>>>>>>
> >>>>>>> I am running under following setup:
> >>>>>>>
> >>>>>>> hadoop:2.0.0-cdh4.0.1
> >>>>>>>
> >>>>>>> hbase: 0.92.1-cdh4.0.1, r
> >>>>>>>
> >>>>>>>
> >>>>>>> Would greatly appreciate any help.
> >>>>>>>
> >>>>>>>
> >>>>>>> Ameya
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>>
> >>>>> Regards,
> >>>>> Cheng Su
> >>>>>
> >>>
> >>>
> >>>
> >>> --
> >>>
> >>> Regards,
> >>> Cheng Su
> >>>
> >>
> >
> >
> >
> > --
> > Kevin O'Dell
> > Customer Operations Engineer, Cloudera
>
>


-- 
Kevin O'Dell
Customer Operations Engineer, Cloudera
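
On the "Did you presplit your table?" question above: one common way to choose split points is to take a sample of the row keys you are about to load and cut it into evenly sized slices. The sketch below illustrates that idea only. It assumes row keys sort as plain strings (the region names in this thread suggest email addresses as keys) and that the sample is representative of the full data set. The function name `split_keys_from_sample` and the example keys are hypothetical, not part of HBase.

```python
def split_keys_from_sample(sample_keys, num_regions):
    """Pick num_regions - 1 split keys at evenly spaced positions in a sorted sample.

    Each returned key becomes the start key of a region after the first one.
    """
    if num_regions < 2:
        return []  # a single region needs no split keys
    keys = sorted(set(sample_keys))
    if len(keys) < num_regions:
        raise ValueError("sample has fewer distinct keys than requested regions")
    step = len(keys) / num_regions
    return [keys[int(i * step)] for i in range(1, num_regions)]


if __name__ == "__main__":
    # Hypothetical sample; in practice this would be drawn from the MR input.
    sample = ["h@x.com", "a@x.com", "c@x.com", "b@x.com",
              "g@x.com", "d@x.com", "f@x.com", "e@x.com"]
    for key in split_keys_from_sample(sample, 4):
        print(key)
```

The resulting keys would then be supplied when the table is created, so that the initial write load is spread across region servers instead of landing on a single region that has to split repeatedly during the load.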
