hbase-issues mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (HBASE-8253) A corrupted log blocked ReplicationSource
Date Tue, 08 Sep 2015 04:48:46 GMT

     [ https://issues.apache.org/jira/browse/HBASE-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell resolved HBASE-8253.
-----------------------------------
    Resolution: Incomplete
      Assignee:     (was: Jieshan Bean)

Marking as incomplete.

Let me note I've seen this issue in production. The replication queue will stall until retries
are exhausted, but then the corrupt hlog will be dropped from the queue and queue processing
will resume.
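
The behavior described above (stall while retrying, then drop the corrupt log and move on) can be sketched roughly as follows. This is an illustrative sketch only, not the actual ReplicationSource code: the names RetryThenSkipQueue, LogReader and drain, and the retry limit of 3, are hypothetical and chosen just for this example.

```java
import java.io.EOFException;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

public class RetryThenSkipQueue {
    /** Hypothetical stand-in for reading one log file; not an HBase API. */
    interface LogReader {
        void readFully(String logName) throws IOException;
    }

    /**
     * Processes logs in queue order. A log that keeps failing is retried up to
     * maxRetries times, then dropped so that later logs can proceed.
     * Returns the names of the logs that were dropped.
     */
    static List<String> drain(Deque<String> queue, LogReader reader, int maxRetries) {
        List<String> dropped = new ArrayList<>();
        while (!queue.isEmpty()) {
            String log = queue.peekFirst();
            int failures = 0;
            boolean done = false;
            while (!done && failures < maxRetries) {
                try {
                    reader.readFully(log);
                    done = true;
                } catch (IOException e) {
                    failures++; // a real implementation would sleep or back off here
                }
            }
            if (!done) {
                dropped.add(log); // retries exhausted: give up on this log
            }
            queue.pollFirst(); // either way, the queue moves on to the next log
        }
        return dropped;
    }

    public static void main(String[] args) {
        Deque<String> queue =
            new ArrayDeque<>(Arrays.asList("log-1", "log-2-corrupt", "log-3"));
        List<String> shipped = new ArrayList<>();
        // Simulated reader: any log whose name contains "corrupt" always hits EOF.
        LogReader reader = name -> {
            if (name.contains("corrupt")) {
                throw new EOFException(name);
            }
            shipped.add(name);
        };
        List<String> dropped = drain(queue, reader, 3);
        System.out.println("shipped=" + shipped + " dropped=" + dropped);
    }
}
```

In this sketch the edits in the dropped log are simply lost to the peer, which matches the trade-off described above: the queue resumes, at the cost of whatever the corrupt log contained.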

> A corrupted log blocked ReplicationSource
> -----------------------------------------
>
>                 Key: HBASE-8253
>                 URL: https://issues.apache.org/jira/browse/HBASE-8253
>             Project: HBase
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 0.94.6
>            Reporter: Jieshan Bean
>         Attachments: HBASE-8253-94.patch
>
>
> A log that was being written became corrupted when we forcibly powered down one node. Only part of the last WALEdit was written to that log, and that log was not the last one in the replication queue.
> ReplicationSource was blocked in this scenario. Many log entries like the following were printed:
> {noformat}
> 2013-03-30 06:53:48,628 WARN  [regionserver26003-EventThread.replicationSource,1] 1 Got: org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:334)
> java.io.EOFException: hdfs://hacluster/hbase/.logs/master11,26003,1364530862620/master11%2C26003%2C1364530862620.1364553936510, entryStart=40434738, pos=40450048, end=40450048, edit=0
> 	at sun.reflect.GeneratedConstructorAccessor42.newInstance(Unknown Source)
> 	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
> 	at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
> 	at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.addFileInfoToException(SequenceFileLogReader.java:295)
> 	at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.next(SequenceFileLogReader.java:240)
> 	at org.apache.hadoop.hbase.replication.regionserver.ReplicationHLogReaderManager.readNextAndSetPosition(ReplicationHLogReaderManager.java:84)
> 	at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.readAllEntriesToReplicateOrNextFile(ReplicationSource.java:412)
> 	at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:330)
> Caused by: java.io.EOFException
> 	at java.io.DataInputStream.readFully(DataInputStream.java:180)
> 	at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:68)
> 	at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:106)
> 	at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2282)
> 	at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2181)
> 	at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2227)
> 	at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.next(SequenceFileLogReader.java:238)
> 	... 3 more
> ..........	
> 2013-03-30 06:54:38,899 WARN  [regionserver26003-EventThread.replicationSource,1] 1 Got: org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:334)
> java.io.EOFException: hdfs://hacluster/hbase/.logs/master11,26003,1364530862620/master11%2C26003%2C1364530862620.1364553936510, entryStart=40434738, pos=40450048, end=40450048, edit=0
> 	at sun.reflect.GeneratedConstructorAccessor42.newInstance(Unknown Source)
> 	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
> 	at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
> 	at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.addFileInfoToException(SequenceFileLogReader.java:295)
> 	at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.next(SequenceFileLogReader.java:240)
> 	at org.apache.hadoop.hbase.replication.regionserver.ReplicationHLogReaderManager.readNextAndSetPosition(ReplicationHLogReaderManager.java:84)
> 	at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.readAllEntriesToReplicateOrNextFile(ReplicationSource.java:412)
> 	at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:330)
> Caused by: java.io.EOFException
> 	at java.io.DataInputStream.readFully(DataInputStream.java:180)
> 	at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:68)
> 	at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:106)
> 	at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2282)
> 	at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2181)
> 	at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2227)
> 	at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.next(SequenceFileLogReader.java:238)
> 	... 3 more
> ...........	
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
