hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: region servers shutdown
Date Thu, 10 Feb 2011 18:56:15 GMT
The first thing to do would be to look at the datanode logs at the time
of the outage. Very often it's caused by either ulimit or xcievers
that weren't properly configured; check out
http://hbase.apache.org/notsoquick.html#ulimit
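
A minimal sketch of that first check, assuming the two limits show up in
the datanode log as "Too many open files" (ulimit) and "exceeds the limit
of concurrent xcievers" (the dfs.datanode.max.xcievers setting in
hdfs-site.xml). The exact wording may differ between Hadoop versions, and
the function name is hypothetical:

```python
import re

# Symptom patterns. The message wording is an assumption based on
# Hadoop 0.20-era datanode logs; adjust it to match your version.
SYMPTOMS = {
    "xcievers": re.compile(r"exceeds the limit of concurrent xcievers"),
    "ulimit": re.compile(r"Too many open files"),
}


def scan_datanode_log(lines):
    """Count how many log lines match each known symptom."""
    counts = {name: 0 for name in SYMPTOMS}
    for line in lines:
        for name, pattern in SYMPTOMS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

Pass it any iterable of lines, for example an open file handle on the
datanode log covering the time of the outage. A non-zero count points at
the corresponding limit; zero counts for both suggest looking elsewhere.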

J-D

On Thu, Feb 10, 2011 at 10:42 AM, Venkatesh <vramanathan00@aol.com> wrote:
>
>
>
> Hi,
> I've had this before, but not on 70% of the cluster: region servers all dying. Any insight is helpful.
> Using hbase-0.20.6, hadoop-0.20.2
> I don't see any error in the datanode or the namenode
> many thanks
>
>
> Here are the relevant log entries
>
> ..in master...
> Got while writing region XXXXXXlog java.io.IOException: Bad connect ack with firstBadLink YYYYYYY
>
> 2011-02-10 01:31:26,052 DEBUG org.apache.hadoop.hbase.regionserver.HLog: Waiting for hlog writers to terminate, iteration #9
> 2011-02-10 01:31:28,974 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
>        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2845)
>        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2102)
>        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2288)
>
> 2011-02-10 01:31:28,975 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_1053173551314261780_21097871 bad datanode[2] nodes == null
> 2011-02-10 01:31:28,975 WARN org.apache.hadoop.hdfs.DFSClient: Could not get block locations. Source file "/hbase_data/AAAA/1560386868/oldlogfile.log" - Aborting...
>
>
> in region server..(one of them)
>
> 2011-02-10 01:29:41,028 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
>        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2845)
>        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2102)
>        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2288)
>
> 2011-02-10 01:29:41,028 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2549916783344080232_21096412 bad datanode[0] nodes == null
> 2011-02-10 01:29:41,029 WARN org.apache.hadoop.hdfs.DFSClient: Could not get block locations. Source file "/hbase_data/user_activity/1710495506/activities/8613593457794008999" - Aborting...
> 2011-02-10 01:29:41,029 FATAL org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Replay of hlog required. Forcing server shutdown
> org.apache.hadoop.hbase.DroppedSnapshotException: region: AAAAAAAA,1297217998178
>        at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1041)
>        at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:896)
>        at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:258)
>        at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:231)
>        at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:154)
> Caused by: java.io.EOFException
>        at java.io.DataInputStream.readByte(DataInputStream.java:250)
>        at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:298)
>        at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:319)
>        at org.apache.hadoop.io.Text.readString(Text.java:400)
>        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2901)
>        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2826)
>        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2102)
>        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2288)
>
