hbase-user mailing list archives

From "M. Aaron Bossert" <maboss...@gmail.com>
Subject Re: org.apache.hadoop.hbase.regionserver.wal.DamagedWALException: Append sequenceId=8689
Date Thu, 06 Jul 2017 21:24:32 GMT
I can't be definitive, but I have had a very similar issue in the past. The root cause was
that my NTP server had died and a couple of nodes in the cluster got wildly out of sync. Check
your HDFS health, and if there are under-replicated blocks, this "could" be your issue (though
the root cause could also be bad disks or any number of other issues that present with the same
"symptoms"). Again, I would take this advice only as far as needed to either rule it out or dig
further; don't go down a rabbit hole. Your errors could have been caused by an entirely different
problem. I have no other context from the error you provided to know where else to look.
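
A rough sketch of the two checks above, in case it helps with ruling them out quickly. It assumes the summary printed by `hdfs fsck /` contains a line of the form "Under-replicated blocks: N (...)", and that you have already collected a clock reading (epoch seconds) from each node by whatever means you prefer. The function names, host names, and the 30-second threshold are made up for illustration and are not part of HBase or HDFS.

```python
import re
import statistics
from typing import Dict, List, Optional


def under_replicated_blocks(fsck_output: str) -> Optional[int]:
    """Pull the under-replicated block count out of fsck summary text.

    Returns None when no such line is present (assumed line format:
    "Under-replicated blocks:<whitespace>N (...)").
    """
    match = re.search(r"Under-replicated blocks:\s*(\d+)", fsck_output)
    return int(match.group(1)) if match else None


def skewed_hosts(clock_by_host: Dict[str, float], max_skew_s: float = 30.0) -> List[str]:
    """Return hosts whose clock differs from the cluster median by more than max_skew_s.

    clock_by_host maps a host name to its reported epoch time in seconds.
    The default threshold is an arbitrary illustrative value.
    """
    if not clock_by_host:
        return []
    median = statistics.median(clock_by_host.values())
    return sorted(
        host for host, ts in clock_by_host.items() if abs(ts - median) > max_skew_s
    )


if __name__ == "__main__":
    # Hypothetical inputs; substitute real fsck output and real clock readings.
    sample_fsck = " Under-replicated blocks:\t12 (0.5 %)\n"
    print("under-replicated:", under_replicated_blocks(sample_fsck))
    print("skewed:", skewed_hosts({"node1": 100.0, "node2": 101.0, "node3": 500.0}))
```

A non-zero block count or a non-empty host list only says where to look next; neither one proves the cause of the DamagedWALException.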

Aaron

> On Jul 6, 2017, at 16:55, Ted Yu <yuzhihong@gmail.com> wrote:
> 
> HBASE-16960 mentions the following:
> 
> Caused by: java.net.SocketTimeoutException: 20000 millis timeout while waiting for channel to be ready for read
> 
> Do you see a similar line in the region server log?
> 
> Cheers
> 
>> On Thu, Jul 6, 2017 at 1:48 PM, anil gupta <anilgupta84@gmail.com> wrote:
>> Hi All,
>> 
>> We are running HBase/Phoenix on EMR 5.2 (HBase 1.2.3 and Phoenix 4.7), and we are running
>> into the following exception when we try to load data into one of our Phoenix tables:
>> 2017-07-06 19:57:57,507 INFO [hconnection-0x60e5272-shared--pool2-t249] org.apache.hadoop.hbase.client.AsyncProcess: #1, table=DE.CONFIG_DATA, attempt=30/35 failed=38ops, last exception: org.apache.hadoop.hbase.regionserver.wal.DamagedWALException: org.apache.hadoop.hbase.regionserver.wal.DamagedWALException: Append sequenceId=8689, requesting roll of WAL
>> 	at org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.append(FSHLog.java:1921)
>> 	at org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1773)
>> 	at org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1695)
>> 	at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
>> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>> 	at java.lang.Thread.run(Thread.java:745)
>> 
>> We are OK with wiping out this table and rebuilding the dataset. We tried dropping
>> and recreating the table, but that didn't fix it.
>> Can anyone please let us know how we can get rid of the above problem? Are we running
>> into https://issues.apache.org/jira/browse/HBASE-16960?
>> 
>> -- 
>> Thanks & Regards,
>> Anil Gupta
> 
