hbase-user mailing list archives

From Sasha Dolgy <sdo...@gmail.com>
Subject Re: HRegionServer: Failed openScanner
Date Fri, 15 May 2009 18:58:22 GMT
Hi Andy,
I've sent you an email with a link to a tar file with the logs.  To be
honest, for the most part this is the default out-of-the-box setup.  This
is the first problem in over 150k writes to HBase.  After I stopped and
started HBase again, everything has been running fine...

I haven't looked at the troubleshooting page yet because, well, I'm not
quite sure what to troubleshoot.  I'm finding it hard to identify an actual
problem, other than seeing stack traces and HBase not working.

-sd

On Fri, May 15, 2009 at 7:54 PM, Andrew Purtell <apurtell@apache.org> wrote:

> The cause is almost surely resource overcommitment: CPU and/or memory,
> leading to thread starvation. We have observed that the JVM scheduler is
> unfair at high load. Swapping can be similarly deadly, especially if the
> JVM heap is paged out when a GC cycle happens. Given other details in this
> thread, I suspect swap. What JVM options are you running with? Have you
> looked at the GC-related tips on the troubleshooting page up on the wiki?
> http://wiki.apache.org/hadoop/Hbase/Troubleshooting
>
> Best regards,
>
>   - Andy
>
>
>
>
> ________________________________
> From: Sasha Dolgy <sdolgy@gmail.com>
> To: hbase-user@hadoop.apache.org
> Sent: Friday, May 15, 2009 11:38:01 AM
> Subject: Re: HRegionServer: Failed openScanner
>
> In the region server logs I see messages from the 14th:
> 2009-05-14 22:47:28,840 INFO org.apache.hadoop.hbase.regionserver.HRegion: starting  compaction on region syslog,,1242260881586
> 2009-05-14 22:47:43,976 INFO org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on region syslog,,1242260881586 in 15sec
>
> then no log entries until the 15th when the error happens:
>
> 2009-05-15 00:55:51,568 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 189138ms, ten times longer than scheduled: 10000
> 2009-05-15 00:55:52,334 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 188348ms, ten times longer than scheduled: 3000
> 2009-05-15 00:55:53,090 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: unable to report to master for 189261 milliseconds - retrying
> 2009-05-15 00:55:56,789 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_CALL_SERVER_STARTUP: safeMode=false
> 2009-05-15 00:55:57,249 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Failed openScanner
> org.apache.hadoop.hbase.NotServingRegionException: .META.,,1
>        at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2076)
>        at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:1710)
>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
>        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:912)
>
>
>
> On Fri, May 15, 2009 at 7:32 PM, Sasha Dolgy <sdolgy@gmail.com> wrote:
>
> > OK, I'll go take a look.  They are both on the local server, so network
> > issues shouldn't be a cause.  Cheers though, I'll go look at the JIRA
> > link.  If I find anything else I'll post here.
> > Thanks
> > -sd
> >
> > On Fri, May 15, 2009 at 6:18 PM, Andrew Purtell <apurtell@apache.org> wrote:
> >
> >> The region server hosting META could not communicate with the master
> >> for a very long time. Some kind of network issue? Any entries in the
> >> region server logs above this one
> >>
> >> > 2009-05-15 00:55:53,090 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: unable to report to master for 189261 milliseconds - retrying
> >>
> >> which may be relevant? Anything about sleeping too long?
> >>
> >> Related, there were some bugs that I am aware of preventing recovery if
> >> META in particular goes away but they should be fixed for 0.20 as of
> >> https://issues.apache.org/jira/browse/HBASE-1362 .
> >>
> >>   - Andy
> >>
> >
>
>
>
>
>
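
The Sleeper warnings quoted above ("We slept 189138ms, ten times longer
than scheduled: 10000") describe a check that can be approximated outside
HBase. What follows is a minimal Java sketch, not HBase's actual Sleeper
code: the class name PauseCheck, its helper methods, and the 1-second
period are hypothetical, and the factor of ten is taken only from the
wording of the log line. It also prints the JVM options and the GC time
accumulated during each sleep, which may help tell a long GC pause apart
from a swap or scheduling stall.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class PauseCheck {
    // Assumption: threshold inferred from the log text "ten times longer than scheduled".
    static final long OVERSLEEP_FACTOR = 10L;

    /** Returns true when the measured sleep exceeds OVERSLEEP_FACTOR times the scheduled period. */
    static boolean isOverslept(long sleptMs, long scheduledMs) {
        return sleptMs > OVERSLEEP_FACTOR * scheduledMs;
    }

    /** Sums collection time over all collectors; a collector reporting -1 (undefined) is skipped. */
    static long totalGcMillis() {
        long total = 0L;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime();
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        long periodMs = 1000L; // hypothetical period, chosen only for the example

        // Print the options this JVM was started with (heap size, GC flags, and so on).
        System.out.println("JVM options: "
                + ManagementFactory.getRuntimeMXBean().getInputArguments());

        for (int i = 0; i < 5; i++) {
            long gcBefore = totalGcMillis();
            long start = System.nanoTime();
            Thread.sleep(periodMs);
            long sleptMs = (System.nanoTime() - start) / 1_000_000L;
            long gcMs = totalGcMillis() - gcBefore;

            if (isOverslept(sleptMs, periodMs)) {
                System.out.println("WARN slept " + sleptMs + "ms, scheduled " + periodMs
                        + "ms, GC time during sleep " + gcMs + "ms");
            } else {
                System.out.println("ok slept " + sleptMs + "ms, GC time during sleep " + gcMs + "ms");
            }
        }
    }
}
```

If the process oversleeps while the GC time delta stays small, the stall
probably came from outside the collector, for example swap or CPU
contention. That is a heuristic, not a diagnosis.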



-- 
Sasha Dolgy
sasha.dolgy@gmail.com
