hbase-user mailing list archives

From sreenivasulu y <sreenivasul...@huawei.com>
Subject RE: RegionServer failed in logsplitting, wal.HLogSplitter: Got while writing log entry to log
Date Mon, 25 Aug 2014 11:23:03 GMT
Hi Ram,

The other machines, machine2 and machine4, are running fine.
I was performing a set of operations, and as part of that I deleted the table.
My doubt is that the machine1 regionserver went down even though I was only performing normal operations.

Your assumed scenario may be correct.
In that case, this is an issue, right?

Regards
seenu

-----Original Message-----
From: ramkrishna vasudevan [mailto:ramkrishna.s.vasudevan@gmail.com] 
Sent: 25 August 2014 PM 03:15
To: user@hbase.apache.org
Subject: Re: RegionServer failed in logsplitting, wal.HLogSplitter: Got while writing log
entry to log

Parent directory doesn't exist: /hbase/data/default/2025Thread0_table/
df2dae34231829adc6ac10b43f5decb2/recovered.edits

Are the other two machines, machine2 and machine4, up and running? Were you able to get back
the table and the number of records you inserted?
It seems to me that after the recovery was successful, the recovered.edits dir for that region
was deleted by the log-splitting thread, and another machine then tried to use that path and failed.
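The failure mode described above can be sketched locally with plain filesystem operations (an illustration only; the /tmp paths are made up, and HDFS lease semantics differ in detail): one writer still intends to create files under recovered.edits while a cleanup step removes the whole directory, so the writer's next create fails with a missing-parent error like the one in the regionserver log.

```shell
# Writer side: create a recovered.edits dir and an edits file under it.
mkdir -p /tmp/region-demo/recovered.edits
touch /tmp/region-demo/recovered.edits/0000000000000000002.temp

# Cleanup side: log splitting finished elsewhere, so the whole dir is removed.
rm -rf /tmp/region-demo/recovered.edits

# The writer's next create now fails because the parent directory is gone,
# analogous to the FileNotFoundException thrown to HLogSplitter.
touch /tmp/region-demo/recovered.edits/0000000000000000003.temp 2>/dev/null \
  || echo "Parent directory doesn't exist"
```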

Regards
Ram



On Mon, Aug 25, 2014 at 12:22 PM, sreenivasulu y <sreenivasulu.y@huawei.com>
wrote:

> Thanks for the reply, Ted.
> I am using the following versions:
> HBase 0.98.3
> HDFS 2.4.3
>
> Do you see other exceptions in machine1 server log ?
> Yes, it throws an error that the following file does not exist:
>
> 2014-07-11 08:23:58,921 WARN  [Thread-39475] hdfs.DFSClient: 
> DataStreamer Exception
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException):
> No lease on
> /hbase/data/default/2025Thread0_table/247b5b359ab81ed5deb5379e0f07ba56/recovered.edits/0000000000000000002.temp:
> File does not exist. [Lease.  Holder:
> DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29,
> pendingcreates: 11]
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2952)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:2772)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2680)
>         at
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:584)
>         at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:440)
>         at
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
>
>         at org.apache.hadoop.ipc.Client.call(Client.java:1410)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1363)
>         at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>         at $Proxy18.addBlock(Unknown Source)
>         at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:361)
>         at sun.reflect.GeneratedMethodAccessor27.invoke(Unknown Source)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190)
>         at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
>         at $Proxy19.addBlock(Unknown Source)
>         at sun.reflect.GeneratedMethodAccessor27.invoke(Unknown Source)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at
> org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:294)
>         at $Proxy20.addBlock(Unknown Source)
>         at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1439)
>         at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1261)
>         at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStrea
> m.java:525)
> 2014-07-11 08:23:58,933 FATAL
> [RS_LOG_REPLAY_OPS-HOST-10-18-40-69:60020-0-Writer-10] wal.HLogSplitter:
> Got while writing log entry to log
> java.io.IOException: cannot get log writer
>         at
> org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createWriter(HLogFactory.java:197)
>         at
> org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createRecoveredEditsWriter(HLogFactory.java:182)
>         at
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.createWriter(HLogSplitter.java:643)
>         at
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.createWAP(HLogSplitter.java:1223)
>         at
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.getWriterAndPath(HLogSplitter.java:1200)
>         at
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.append(HLogSplitter.java:1243)
>         at
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.writeBuffer(HLogSplitter.java:851)
>         at
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.doRun(HLogSplitter.java:843)
>         at
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.run
> (HLogSplitter.java:813) Caused by: java.io.FileNotFoundException: 
> Parent directory doesn't exist:
> /hbase/data/default/2025Thread0_table/df2dae34231829adc6ac10b43f5decb2/recovered.edits
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.verifyParentDir(FSNamesystem.java:2156)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:2289)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2237)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2190)
>         at
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:520)
>         at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:354)
>         at
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
>
>         at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> Method)
>         at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>         at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>         at
> org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
>         at
> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
>         at
> org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1604)
>         at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1465)
>         at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1425)
>         at
> org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:437)
>         at
> org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:433)
>         at
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>         at
> org.apache.hadoop.hdfs.DistributedFileSystem.createNonRecursive(DistributedFileSystem.java:433)
>         at
> org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1110)
>         at
> org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1086)
>         at
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.init(ProtobufLogWriter.java:78)
>         at
> org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createWriter(HLogFactory.java:194)
>         ... 8 more
>
> At the same time, the NameNode deleted the file:
>
> 2014-07-11 08:23:57,983 INFO BlockStateChange: BLOCK* addToInvalidates:
> blk_1073848427_107656 10.18.40.69:50076 10.18.40.89:50076
> 2014-07-11 08:23:58,786 INFO org.apache.hadoop.hdfs.StateChange: 
> BLOCK*
> allocateBlock:
> /hbase/data/default/2025Thread0_table/6cd0083d54dbd8a8b010c00a2770a321/recovered.edits/0000000000000000002.temp.
> BP-89134156-10.18.40.69-1405003788606
> blk_1073849564_108794{blockUCState=UNDER_CONSTRUCTION, 
> primaryNodeIndex=-1, 
> replicas=[ReplicaUnderConstruction[[DISK]DS-de56b04c-75dc-4884-aede-22
> 12c1a1305e:NORMAL|RBW], 
> ReplicaUnderConstruction[[DISK]DS-6a01309e-a4f2-4ad2-a100-38ae0c947427
> :NORMAL|RBW]]}
> 2014-07-11 08:23:58,826 INFO BlockStateChange: BLOCK* addToInvalidates:
> blk_1073848423_107652 10.18.40.89:50076 10.18.40.69:50076
> 2014-07-11 08:23:58,832 INFO org.apache.hadoop.ipc.Server: IPC Server 
> handler 8 on 65110, call 
> org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from
> 10.18.40.69:37208 Call#589066 Retry#0:
> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease 
> on
> /hbase/data/default/2025Thread0_table/7d5032128770926448aa7e02c3cf963e/recovered.edits/0000000000000000002.temp:
> File does not exist. [Lease.  Holder:
> DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29,
> pendingcreates: 11]
> 2014-07-11 08:23:58,833 INFO org.apache.hadoop.ipc.Server: IPC Server 
> handler 5 on 65110, call 
> org.apache.hadoop.hdfs.protocol.ClientProtocol.create from
> 10.18.40.69:37208 Call#589067 Retry#0: java.io.FileNotFoundException:
> Parent directory doesn't exist:
> /hbase/data/default/2025Thread0_table/df2dae34231829adc6ac10b43f5decb2
> /recovered.edits
> 2014-07-11 08:23:58,834 INFO org.apache.hadoop.ipc.Server: IPC Server 
> handler 6 on 65110, call 
> org.apache.hadoop.hdfs.protocol.ClientProtocol.create from
> 10.18.40.69:37208 Call#589068 Retry#0: java.io.FileNotFoundException:
> Parent directory doesn't exist:
> /hbase/data/default/2025Thread0_table/f17021778ea258ffb5edde2d86b0237f
> /recovered.edits
> 2014-07-11 08:23:58,834 INFO org.apache.hadoop.ipc.Server: IPC Server 
> handler 1 on 65110, call 
> org.apache.hadoop.hdfs.protocol.ClientProtocol.create from
> 10.18.40.69:37208 Call#589069 Retry#0: java.io.FileNotFoundException:
> Parent directory doesn't exist:
> /hbase/data/default/2025Thread0_table/046da18c6ddeae7df459c42daf711c78
> /recovered.edits
> 2014-07-11 08:23:58,844 INFO org.apache.hadoop.ipc.Server: IPC Server 
> handler 5 on 65110, call 
> org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from
> 10.18.40.69:37208 Call#589072 Retry#0:
> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease 
> on
> /hbase/data/default/2025Thread0_table/995a56512784d109352d9dcbbeeebfbc/recovered.edits/0000000000000000002.temp:
> File does not exist. [Lease.  Holder:
> DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29,
> pendingcreates: 11]
> 2014-07-11 08:23:58,844 INFO org.apache.hadoop.ipc.Server: IPC Server 
> handler 8 on 65110, call 
> org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from
> 10.18.40.69:37208 Call#589070 Retry#0:
> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease 
> on
> /hbase/data/default/2025Thread0_table/247b5b359ab81ed5deb5379e0f07ba56/recovered.edits/0000000000000000002.temp:
> File does not exist. [Lease.  Holder:
> DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29,
> pendingcreates: 11]
> 2014-07-11 08:23:58,844 INFO org.apache.hadoop.ipc.Server: IPC Server 
> handler 6 on 65110, call 
> org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from
> 10.18.40.69:37208 Call#589071 Retry#0:
> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease 
> on
> /hbase/data/default/2025Thread0_table/6b8b92d73b51e1fd245854e62111fc8c/recovered.edits/0000000000000000002.temp:
> File does not exist. [Lease.  Holder:
> DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29,
> pendingcreates: 11]
> 2014-07-11 08:23:58,857 INFO org.apache.hadoop.ipc.Server: IPC Server 
> handler 1 on 65110, call 
> org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from
> 10.18.40.69:37208 Call#589073 Retry#0:
> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease 
> on
> /hbase/data/default/2025Thread0_table/591d084674854bb913f419046e7f7d0c/recovered.edits/0000000000000000002.temp:
> File does not exist. [Lease.  Holder:
> DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29,
> pendingcreates: 11]
> 2014-07-11 08:23:58,960 INFO org.apache.hadoop.ipc.Server: IPC Server 
> handler 8 on 65110, call 
> org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from
> 10.18.40.69:37208 Call#589074 Retry#0:
> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease 
> on
> /hbase/data/default/2025Thread0_table/5c514b93ed15b38cdc210abed58907df/recovered.edits/0000000000000000002.temp:
> File does not exist. [Lease.  Holder:
> DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29,
> pendingcreates: 11]
> 2014-07-11 08:23:58,966 INFO org.apache.hadoop.ipc.Server: IPC Server 
> handler 7 on 65110, call 
> org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from
> 10.18.40.69:37208 Call#589076 Retry#0:
> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease 
> on
> /hbase/data/default/2025Thread0_table/b6d129691540af012a388590bf1e8629/recovered.edits/0000000000000000002.temp:
> File does not exist. [Lease.  Holder:
> DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29,
> pendingcreates: 11]
> 2014-07-11 08:23:58,977 INFO org.apache.hadoop.ipc.Server: IPC Server 
> handler 3 on 65110, call 
> org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from
> 10.18.40.69:37208 Call#589077 Retry#0:
> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease 
> on
> /hbase/data/default/2025Thread0_table/190414760587ee6adbbb518e7e0c719f/recovered.edits/0000000000000000002.temp:
> File does not exist. [Lease.  Holder:
> DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29,
> pendingcreates: 11]
> 2014-07-11 08:23:58,996 INFO BlockStateChange: BLOCK* addToInvalidates:
> blk_1073849517_108746 10.18.40.69:50076 10.18.40.89:50076
> 2014-07-11 08:23:59,014 INFO BlockStateChange: BLOCK* addToInvalidates:
> blk_1073848446_107675 10.18.40.89:50076 10.18.40.69:50076
>
>
> -----Original Message-----
> From: Ted Yu [mailto:yuzhihong@gmail.com]
> Sent: 25 August 2014 AM 11:03
> To: user@hbase.apache.org
> Subject: Re: RegionServer failed in logsplitting, wal.HLogSplitter: 
> Got while writing log entry to log
>
> What release of HBase are you using ?
>
> Do you see other exceptions in machine1 server log ?
>
> Please check namenode log as well.
>
> Cheers
>
>
> On Sun, Aug 24, 2014 at 6:37 PM, sreenivasulu y 
> <sreenivasulu.y@huawei.com
> >
> wrote:
>
> > Hi,
> >
> > I am running a 4-node cluster with HDFS in HA mode:
> > Machine1, machine2, machine3 and machine4.
> > And I am performing the following operations, in order:
> > 1. Create a table with pre-created regions.
> > 2. Insert 1000 records into the table.
> > 3. Disable the table.
> > 4. Modify the table.
> > 5. Enable the table.
> > 6. Disable and delete the table.
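The sequence above could be reproduced from the HBase shell roughly as follows (a sketch only, not the exact commands used; the table name, column family, and split keys are made up for illustration):

```
hbase> create 't1', 'cf', SPLITS => ['10', '20', '30']
hbase> # ... insert 1000 records via put or a client program ...
hbase> disable 't1'
hbase> alter 't1', {NAME => 'cf', VERSIONS => 3}
hbase> enable 't1'
hbase> disable 't1'
hbase> drop 't1'
```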
> >
> > While performing steps 3 and 4 above, the machine3
> > regionserver went down.
> > But machine1 threw the following error:
> > 2014-07-11 08:23:58,933 FATAL
> > [RS_LOG_REPLAY_OPS-HOST-10-18-40-69:60020-0-Writer-10] wal.HLogSplitter:
> > Got while writing log entry to log
> > java.io.IOException: cannot get log writer
> >         at
> >
> org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createWriter(HLog
> Factory.java:197)
> >         at
> >
> org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createRecoveredEd
> itsWriter(HLogFactory.java:182)
> >         at
> >
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.createWriter(HLo
> gSplitter.java:643)
> >         at
> >
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEdit
> sOutputSink.createWAP(HLogSplitter.java:1223)
> >         at
> >
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEdit
> sOutputSink.getWriterAndPath(HLogSplitter.java:1200)
> >         at
> >
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEdit
> sOutputSink.append(HLogSplitter.java:1243)
> >         at
> >
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.wri
> teBuffer(HLogSplitter.java:851)
> >         at
> >
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.doR
> un(HLogSplitter.java:843)
> >         at
> > org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.r
> > un
> > (HLogSplitter.java:813)
> > and machine1 also went down.
> >
> > Please help me understand why the machine1 regionserver went down.
> >
> >
> >
> >
> > ----------------------------------------------------------------------
> > This e-mail and its attachments contain confidential information from
> > HUAWEI, which is intended only for the person or entity whose address
> > is listed above. Any use of the information contained herein in any way
> > (including, but not limited to, total or partial disclosure,
> > reproduction, or dissemination) by persons other than the intended
> > recipient(s) is prohibited. If you receive this e-mail in error, please
> > notify the sender by phone or email immediately and delete it!
> >
> >
>