hbase-user mailing list archives

From Jean-Marc Spaggiari <jean-m...@spaggiari.org>
Subject Re: Exceptions in log files of Hadoop and Hbase
Date Tue, 13 Aug 2013 15:42:06 GMT
Hi Vimal,

4GB for all the processes is very short... You might want to run in
standalone mode instead of pseudo-distributed. That will save you some
memory. Have you checked if your server is swapping? That will make it
slow, and then you will miss some heartbeats and processes will close...
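For reference, a quick way to check for swapping on a typical Linux box
(standard tools only, nothing HBase-specific):

```shell
# Total vs. used swap in MB; a non-zero "used" value in the Swap row
# means the box has been pushed into swap at some point.
free -m

# The si/so columns are swap-in/swap-out in KB/s, sampled every second;
# sustained non-zero values mean the machine is actively swapping now.
vmstat 1 5
```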

JM

2013/8/13 Vimal Jain <vkjk89@gmail.com>

> Hi Jean-Marc,
> Thanks for your reply.
> I have a one-node cluster (pseudo-distributed mode), so 192.168.20.30 is the
> only server; it hosts all 6 processes
> (NameNode, DataNode, SecondaryNameNode, HMaster, HRegionServer and ZooKeeper).
>
> At the time of this problem, I had given the following memory to these
> processes (I had 4 GB of total memory on this box):
>
> NameNode, DataNode, SecondaryNameNode - 1 GB each (default)
> HMaster, ZooKeeper - 512 MB each
> HRegionServer - 1.5 GB
>
> So the total memory allocation is 5.5 GB, which exceeds physical memory. (I
> am planning to increase the RAM on this system to 8 GB and change the
> allocation to 2 GB for HRegionServer and 1 GB for each of the other 5.)
> So could that be the problem?
> I mention memory because I see the following in the HMaster log.
>
> *org.apache.hadoop.hbase.util.Sleeper: We slept 1266874891120ms instead of
> 10000ms, this is likely due to a long garbage collecting pause and it's
> usually bad*
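The over-allocation described above can be totalled with a quick shell
sketch (heap sizes in MB, mirroring the process list in this thread):

```shell
# Heap sizes in MB as described in the thread: NameNode, DataNode and
# SecondaryNameNode at 1 GB each, HMaster and ZooKeeper at 512 MB each,
# HRegionServer at 1.5 GB.
heaps="1024 1024 1024 512 512 1536"

total=0
for h in $heaps; do
  total=$((total + h))
done

# 5632 MB, i.e. 5.5 GB of heap promised on a 4 GB box.
echo "total heap: ${total} MB"
```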
>
>
>
> On Tue, Aug 13, 2013 at 6:42 PM, Jean-Marc Spaggiari <jean-marc@spaggiari.org> wrote:
>
> > Hi Vimal,
> >
> > What was your cluster doing at that time? Was it very busy? It looks like
> > one server (192.168.20.30 <http://192.168.20.30:50010>) got so busy that
> > it failed to report in and was closed.
> >
> > JM
> >
> > 2013/8/13 Vimal Jain <vkjk89@gmail.com>
> >
> > > Hi,
> > > I am using Hadoop and HBase in pseudo-distributed mode.
> > > I am using Hadoop version 1.1.2 and HBase version 0.94.7.
> > >
> > > Recently I found some exceptions in the Hadoop and HBase logs.
> > > I am not sure what has caused them.
> > > Could you please help here?
> > >
> > >
> > > *Exception in Master log :*
> > > 2013-07-31 15:51:04,694 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 1266874891120ms instead of 10000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> > > 2013-07-31 15:51:04,798 WARN org.apache.hadoop.hbase.master.CatalogJanitor: Failed scan of catalog table
> > > org.apache.hadoop.hbase.client.ScannerTimeoutException: 82253ms passed since the last invocation, timeout is currently set to 60000
> > >     at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:283)
> > >     at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:727)
> > >     at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:184)
> > >     at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:169)
> > >     at org.apache.hadoop.hbase.master.CatalogJanitor.getSplitParents(CatalogJanitor.java:123)
> > >     at org.apache.hadoop.hbase.master.CatalogJanitor.scan(CatalogJanitor.java:134)
> > >     at org.apache.hadoop.hbase.master.CatalogJanitor.chore(CatalogJanitor.java:92)
> > >     at org.apache.hadoop.hbase.Chore.run(Chore.java:67)
> > >     at java.lang.Thread.run(Thread.java:662)
> > > Caused by: org.apache.hadoop.hbase.UnknownScannerException: org.apache.hadoop.hbase.UnknownScannerException: Name: -8839286818925700393
> > >     at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2544)
> > >     at sun.reflect.GeneratedMethodAccessor19.invoke(Unknown Source)
> > >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > >     at java.lang.reflect.Method.invoke(Method.java:597)
> > >     at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
> > >     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
> > >     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> > >     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
> > >     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
> > >     at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
> > >     at org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:96)
> > >     at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:143)
> > >     at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:42)
> > >     at org.apache.hadoop.hbase.client.ServerCallable.withRetries(ServerCallable.java:164)
> > >     at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:274)
> > >     ... 8 more
> > > 2013-07-31 15:54:42,526 DEBUG org.apache.hadoop.hbase.client.ClientScanner: Creating scanner over .META. starting at key ''
> > > 2013-07-31 15:54:42,526 DEBUG org.apache.hadoop.hbase.client.ClientScanner: Advancing internal scanner to startKey at ''
> > > 2013-07-31 15:54:42,531 DEBUG org.apache.hadoop.hbase.client.ClientScanner: Finished with scanning at {NAME => '.META.,,1', STARTKEY => '', ENDKEY => '', ENCODED => 1028785192,}
> > > 2013-07-31 15:54:42,532 DEBUG org.apache.hadoop.hbase.master.CatalogJanitor: Scanned 5 catalog row(s) and gc'd 0 unreferenced parent region(s)
> > > 2013-07-31 15:54:42,751 INFO org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing because balanced cluster; servers=1 regions=5 average=5.0 mostloaded=5 leastloaded=5
> > >
> > > 2013-07-31 16:43:23,358 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 25771ms instead of 1000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> > > 2013-07-31 16:43:23,358 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 30091ms instead of 1000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> > > 2013-07-31 16:43:23,361 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 28613ms instead of 10000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> > > 2013-07-31 16:43:23,361 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 27457ms instead of 10000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> > > 2013-07-31 16:43:23,362 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 34587ms instead of 10000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> > > 2013-07-31 16:43:23,367 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 78600ms instead of 60000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> > > 2013-07-31 16:43:23,369 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 78562ms instead of 60000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
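As a side note, pauses like the Sleeper warnings above can be confirmed by
turning on GC logging in conf/hbase-env.sh (a sketch for the HotSpot JVMs
of that era, i.e. Java 6/7; the log path is just an example):

```shell
# Classic HotSpot GC-logging flags, appended to the options HBase
# passes to its JVMs. Long stop-the-world pauses will show up in the
# gc log with their durations, matching the Sleeper warnings.
export HBASE_OPTS="$HBASE_OPTS -verbose:gc -XX:+PrintGCDetails \
  -XX:+PrintGCDateStamps -Xloggc:/var/log/hbase/gc-hbase.log"
```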
> > >
> > >
> > > *Exception in Region log :*
> > > java.io.IOException: Reflection
> > >     at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.sync(SequenceFileLogWriter.java:304)
> > >     at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1375)
> > >     at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1319)
> > >     at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1480)
> > >     at org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.run(HLog.java:1271)
> > >     at java.lang.Thread.run(Thread.java:662)
> > > Caused by: java.lang.reflect.InvocationTargetException
> > >     at sun.reflect.GeneratedMethodAccessor156.invoke(Unknown Source)
> > >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > >     at java.lang.reflect.Method.invoke(Method.java:597)
> > >     at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.sync(SequenceFileLogWriter.java:302)
> > >     ... 5 more
> > > Caused by: java.io.IOException: DFSOutputStream is closed
> > >     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.sync(DFSClient.java:3669)
> > >     at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:97)
> > >     at org.apache.hadoop.io.SequenceFile$Writer.syncFs(SequenceFile.java:995)
> > >     ... 9 more
> > > 2013-07-31 15:50:37,761 FATAL org.apache.hadoop.hbase.regionserver.wal.HLog: Could not sync. Requesting close of hlog
> > >
> > > 2013-07-31 19:23:38,289 INFO org.apache.hadoop.hdfs.DFSClient: Could not complete file /hbase/.logs/hbase.xyz.com,60020,1370497170634/hbase.xyz.com%2C60020%2C1370497170634.1375265949987 retrying...
> > > 2013-07-31 19:23:38,289 INFO org.apache.hadoop.hdfs.DFSClient: Could not complete file /hbase/.logs/hbase.xyz.com,60020,1370497170634/hbase.xyz.com%2C60020%2C1370497170634.1375265949987 retrying...
> > >
> > >
> > > *Exception in data node :*
> > >
> > > 2013-07-31 19:39:51,094 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: org.apache.hadoop.ipc.RemoteException: java.io.IOException: Got blockReceived message from unregistered or dead node blk_-4787262105551508952_28369
> > >     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.blockReceived(FSNamesystem.java:4188)
> > >     at org.apache.hadoop.hdfs.server.namenode.NameNode.blockReceived(NameNode.java:1069)
> > >     at sun.reflect.GeneratedMethodAccessor56.invoke(Unknown Source)
> > >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > >     at java.lang.reflect.Method.invoke(Method.java:597)
> > >     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578)
> > >     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393)
> > >     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389)
> > >     at java.security.AccessController.doPrivileged(Native Method)
> > >     at javax.security.auth.Subject.doAs(Subject.java:396)
> > >     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
> > >     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387)
> > >
> > >     at org.apache.hadoop.ipc.Client.call(Client.java:1107)
> > >     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
> > >     at sun.proxy.$Proxy5.blockReceived(Unknown Source)
> > >     at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:1006)
> > >     at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1527)
> > >     at java.lang.Thread.run(Thread.java:662)
> > >
> > > *Exception in name node :*
> > >
> > > 2013-07-31 19:39:50,671 WARN org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.blockReceived: blk_-4787262105551508952_28369 is received from dead or unregistered node 192.168.20.30:50010
> > > 2013-07-31 19:39:50,671 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hadoop cause:java.io.IOException: Got blockReceived message from unregistered or dead node blk_-4787262105551508952_28369
> > > 2013-07-31 19:39:50,671 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 9000, call blockReceived(DatanodeRegistration(192.168.20.30:50010, storageID=DS-1816106352-192.168.20.30-50010-1369314076237, infoPort=50075, ipcPort=50020), [Lorg.apache.hadoop.hdfs.protocol.Block;@64f2d559, [Ljava.lang.String;@294f9d6) from 192.168.20.30:59764: error: java.io.IOException: Got blockReceived message from unregistered or dead node blk_-4787262105551508952_28369
> > > java.io.IOException: Got blockReceived message from unregistered or dead node blk_-4787262105551508952_28369
> > >     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.blockReceived(FSNamesystem.java:4188)
> > >     at org.apache.hadoop.hdfs.server.namenode.NameNode.blockReceived(NameNode.java:1069)
> > >     at sun.reflect.GeneratedMethodAccessor56.invoke(Unknown Source)
> > >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > >     at java.lang.reflect.Method.invoke(Method.java:597)
> > >     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578)
> > >     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393)
> > >     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389)
> > >     at java.security.AccessController.doPrivileged(Native Method)
> > >     at javax.security.auth.Subject.doAs(Subject.java:396)
> > >     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
> > >     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387)
> > >
> > >
> > >
> > >
> > >
> > > --
> > > Thanks and Regards,
> > > Vimal Jain
> > >
> >
>
>
>
> --
> Thanks and Regards,
> Vimal Jain
>
