hbase-user mailing list archives

From Vimal Jain <vkj...@gmail.com>
Subject Re: Exceptions in log files of Hadoop and Hbase
Date Tue, 13 Aug 2013 15:37:00 GMT
Hi Jean-Marc,
Thanks for your reply.
I have a one-node cluster (pseudo-distributed mode), so 192.168.20.30 is the
only server, and it hosts all 6 processes
(NameNode, DataNode, SecondaryNameNode, HMaster, HRegionServer and ZooKeeper).

At the time of this problem, I had allocated the following memory to these
processes (I have 4 GB of memory in total on this box):

NameNode, DataNode, SecondaryNameNode - 1 GB each (default)
HMaster, ZooKeeper - 512 MB each
HRegionServer - 1.5 GB

So the total memory allocation is 5.5 GB, which exceeds physical memory. (I am
planning to increase the RAM on this system to 8 GB and change the allocation
to 2 GB for HRegionServer and 1 GB for each of the other 5 processes.)
Could that be the problem?
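As a quick sanity check of the arithmetic above (the per-process figures are the ones listed, not values read from any configuration file):

```python
# Per-process heap allocations in GB, as listed above.
heaps = {
    "NameNode": 1.0,
    "DataNode": 1.0,
    "SecondaryNameNode": 1.0,
    "HMaster": 0.5,
    "ZooKeeper": 0.5,
    "HRegionServer": 1.5,
}

total = sum(heaps.values())
physical = 4.0  # GB currently installed

print(f"total heap = {total} GB, physical = {physical} GB, "
      f"overcommitted by {total - physical} GB")

# Planned layout: 8 GB of RAM, 2 GB for HRegionServer,
# 1 GB for each of the other 5 processes.
planned = 2.0 + 5 * 1.0
print(f"planned heap = {planned} GB out of 8 GB")
```

With heaps alone exceeding physical memory, the OS will swap, and a swapped JVM can stall for arbitrarily long during GC, which fits the Sleeper warnings below.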
I mention memory because I see the following in the HMaster log.

*org.apache.hadoop.hbase.util.Sleeper: We slept 1266874891120ms instead of
10000ms, this is likely due to a long garbage collecting pause and it's
usually bad*
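One way to confirm whether these pauses really are GC-related is to enable GC logging for the region server JVM. A sketch for conf/hbase-env.sh (the log file path is illustrative; adjust it to your installation):

```shell
# In conf/hbase-env.sh -- append GC logging flags to the region server JVM.
# -verbose:gc / -XX:+PrintGCDetails / -XX:+PrintGCTimeStamps are standard
# HotSpot options on Java 6; the -Xloggc: path below is only an example.
export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS \
  -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
  -Xloggc:/var/log/hbase/regionserver-gc.log"
```

If the pauses in that GC log are short while the Sleeper warnings persist, swapping rather than GC itself is the likelier cause.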



On Tue, Aug 13, 2013 at 6:42 PM, Jean-Marc Spaggiari <jean-marc@spaggiari.org> wrote:

> Hi Vimal,
>
> What was your cluster doing at that time? Was it very busy? It looks like
> one server (192.168.20.30 <http://192.168.20.30:50010>) became so busy that
> it failed to report in and its connection was closed.
>
> JM
>
> 2013/8/13 Vimal Jain <vkjk89@gmail.com>
>
> > Hi,
> > I am using Hadoop and HBase in pseudo-distributed mode,
> > with Hadoop version 1.1.2 and HBase version 0.94.7.
> >
> > Recently I found some exceptions in the Hadoop and HBase logs.
> > I am not sure what caused them.
> > Could you please help?
> >
> >
> > *Exception in Master log :*
> > 2013-07-31 15:51:04,694 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 1266874891120ms instead of 10000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> > 2013-07-31 15:51:04,798 WARN org.apache.hadoop.hbase.master.CatalogJanitor: Failed scan of catalog table
> > org.apache.hadoop.hbase.client.ScannerTimeoutException: 82253ms passed since the last invocation, timeout is currently set to 60000
> >     at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:283)
> >     at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:727)
> >     at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:184)
> >     at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:169)
> >     at org.apache.hadoop.hbase.master.CatalogJanitor.getSplitParents(CatalogJanitor.java:123)
> >     at org.apache.hadoop.hbase.master.CatalogJanitor.scan(CatalogJanitor.java:134)
> >     at org.apache.hadoop.hbase.master.CatalogJanitor.chore(CatalogJanitor.java:92)
> >     at org.apache.hadoop.hbase.Chore.run(Chore.java:67)
> >     at java.lang.Thread.run(Thread.java:662)
> > Caused by: org.apache.hadoop.hbase.UnknownScannerException: org.apache.hadoop.hbase.UnknownScannerException: Name: -8839286818925700393
> >     at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2544)
> >     at sun.reflect.GeneratedMethodAccessor19.invoke(Unknown Source)
> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >     at java.lang.reflect.Method.invoke(Method.java:597)
> >     at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
> >     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
> >
> >     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> >     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
> >     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
> >     at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
> >     at org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:96)
> >     at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:143)
> >     at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:42)
> >     at org.apache.hadoop.hbase.client.ServerCallable.withRetries(ServerCallable.java:164)
> >     at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:274)
> >     ... 8 more
> > 2013-07-31 15:54:42,526 DEBUG org.apache.hadoop.hbase.client.ClientScanner: Creating scanner over .META. starting at key ''
> > 2013-07-31 15:54:42,526 DEBUG org.apache.hadoop.hbase.client.ClientScanner: Advancing internal scanner to startKey at ''
> > 2013-07-31 15:54:42,531 DEBUG org.apache.hadoop.hbase.client.ClientScanner: Finished with scanning at {NAME => '.META.,,1', STARTKEY => '', ENDKEY => '', ENCODED => 1028785192,}
> > 2013-07-31 15:54:42,532 DEBUG org.apache.hadoop.hbase.master.CatalogJanitor: Scanned 5 catalog row(s) and gc'd 0 unreferenced parent region(s)
> > 2013-07-31 15:54:42,751 INFO org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing because balanced cluster; servers=1 regions=5 average=5.0 mostloaded=5 leastloaded=5
> >
> > 2013-07-31 16:43:23,358 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 25771ms instead of 1000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> > 2013-07-31 16:43:23,358 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 30091ms instead of 1000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> > 2013-07-31 16:43:23,361 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 28613ms instead of 10000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> > 2013-07-31 16:43:23,361 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 27457ms instead of 10000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> > 2013-07-31 16:43:23,362 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 34587ms instead of 10000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> > 2013-07-31 16:43:23,367 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 78600ms instead of 60000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> > 2013-07-31 16:43:23,369 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 78562ms instead of 60000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> >
> >
> > *Exception in Region log :*
> > java.io.IOException: Reflection
> >     at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.sync(SequenceFileLogWriter.java:304)
> >     at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1375)
> >     at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1319)
> >     at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1480)
> >     at org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.run(HLog.java:1271)
> >     at java.lang.Thread.run(Thread.java:662)
> > Caused by: java.lang.reflect.InvocationTargetException
> >     at sun.reflect.GeneratedMethodAccessor156.invoke(Unknown Source)
> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >     at java.lang.reflect.Method.invoke(Method.java:597)
> >     at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.sync(SequenceFileLogWriter.java:302)
> >     ... 5 more
> > Caused by: java.io.IOException: DFSOutputStream is closed
> >     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.sync(DFSClient.java:3669)
> >     at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:97)
> >     at org.apache.hadoop.io.SequenceFile$Writer.syncFs(SequenceFile.java:995)
> >     ... 9 more
> > 2013-07-31 15:50:37,761 FATAL org.apache.hadoop.hbase.regionserver.wal.HLog: Could not sync. Requesting close of hlog
> >
> > 2013-07-31 19:23:38,289 INFO org.apache.hadoop.hdfs.DFSClient: Could not complete file /hbase/.logs/hbase.xyz.com,60020,1370497170634/hbase.xyz.com%2C60020%2C1370497170634.1375265949987 retrying...
> > 2013-07-31 19:23:38,289 INFO org.apache.hadoop.hdfs.DFSClient: Could not complete file /hbase/.logs/hbase.xyz.com,60020,1370497170634/hbase.xyz.com%2C60020%2C1370497170634.1375265949987 retrying...
> >
> > *Exception in data node :-*
> >
> > 2013-07-31 19:39:51,094 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: org.apache.hadoop.ipc.RemoteException: java.io.IOException: Got blockReceived message from unregistered or dead node blk_-4787262105551508952_28369
> >     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.blockReceived(FSNamesystem.java:4188)
> >     at org.apache.hadoop.hdfs.server.namenode.NameNode.blockReceived(NameNode.java:1069)
> >     at sun.reflect.GeneratedMethodAccessor56.invoke(Unknown Source)
> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >     at java.lang.reflect.Method.invoke(Method.java:597)
> >     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578)
> >     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393)
> >     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389)
> >     at java.security.AccessController.doPrivileged(Native Method)
> >     at javax.security.auth.Subject.doAs(Subject.java:396)
> >     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
> >     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387)
> >
> >     at org.apache.hadoop.ipc.Client.call(Client.java:1107)
> >     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
> >     at sun.proxy.$Proxy5.blockReceived(Unknown Source)
> >     at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:1006)
> >     at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1527)
> >     at java.lang.Thread.run(Thread.java:662)
> >
> > *Exception in name node :- *
> >
> > 2013-07-31 19:39:50,671 WARN org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.blockReceived: blk_-4787262105551508952_28369 is received from dead or unregistered node 192.168.20.30:50010
> > 2013-07-31 19:39:50,671 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hadoop cause:java.io.IOException: Got blockReceived message from unregistered or dead node blk_-4787262105551508952_28369
> > 2013-07-31 19:39:50,671 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 9000, call blockReceived(DatanodeRegistration(192.168.20.30:50010, storageID=DS-1816106352-192.168.20.30-50010-1369314076237, infoPort=50075, ipcPort=50020), [Lorg.apache.hadoop.hdfs.protocol.Block;@64f2d559, [Ljava.lang.String;@294f9d6) from 192.168.20.30:59764: error: java.io.IOException: Got blockReceived message from unregistered or dead node blk_-4787262105551508952_28369
> > java.io.IOException: Got blockReceived message from unregistered or dead node blk_-4787262105551508952_28369
> >     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.blockReceived(FSNamesystem.java:4188)
> >     at org.apache.hadoop.hdfs.server.namenode.NameNode.blockReceived(NameNode.java:1069)
> >     at sun.reflect.GeneratedMethodAccessor56.invoke(Unknown Source)
> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >     at java.lang.reflect.Method.invoke(Method.java:597)
> >     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578)
> >     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393)
> >     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389)
> >     at java.security.AccessController.doPrivileged(Native Method)
> >     at javax.security.auth.Subject.doAs(Subject.java:396)
> >     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
> >     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387)
> >
> >
> >
> >
> >
> > --
> > Thanks and Regards,
> > Vimal Jain
> >
>



-- 
Thanks and Regards,
Vimal Jain
