hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: DFS too busy/down? while writing back to HDFS.
Date Tue, 06 Apr 2010 16:26:53 GMT
From DataXceiver's javadoc:

/**
 * Thread for processing incoming/outgoing data stream.
 */

So it's a bit different from the handlers AFAIK.
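
If it's the xcievers limit you're hitting, the usual fix is to raise it in
each datanode's conf/hdfs-site.xml and restart the datanodes, along these
lines (the value below is only an example, size it to your cluster):

<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>2048</value>
</property>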

J-D

On Mon, Apr 5, 2010 at 10:57 PM, steven zhuang
<steven.zhuang.1984@gmail.com> wrote:
> thanks, J.D.
>          my cluster has the first problem. BTW, dfs.datanode.max.xcievers
> means the number of concurrent connections a datanode will handle, right?
>
> On Tue, Apr 6, 2010 at 12:35 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
>
>> Look at your datanode logs around the same time. You probably either have
>> this
>>
>> http://wiki.apache.org/hadoop/Hbase/Troubleshooting#A5
>>
>> or that
>>
>> http://wiki.apache.org/hadoop/Hbase/FAQ#A6
>>
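>> To tell which of the two it is, something like this against the datanode
>> logs around that time usually shows it (paths and messages are indicative,
>> adjust to your install):
>>
>>   grep -i xcievers logs/*datanode*.log
>>   grep -i "Too many open files" logs/*datanode*.log
>>   ulimit -n   # run as the user that starts the datanode; 1024 is usually too low
>>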
>> Also, judging by the metrics you seem to be putting a fair number of
>> regions on those region servers, so do consider setting HBASE_HEAP
>> higher than 1GB in conf/hbase-env.sh.
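>>
>> In the stock conf/hbase-env.sh the knob is the HBASE_HEAPSIZE variable
>> (value in MB), so roughly something like this on the region servers,
>> followed by a restart (2000 is just an example, size it to your boxes):
>>
>> export HBASE_HEAPSIZE=2000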
>>
>> J-D
>>
>> On Mon, Apr 5, 2010 at 8:38 PM, steven zhuang
>> <steven.zhuang.1984@gmail.com> wrote:
>> > greetings,
>> >
>> >        While I was importing data into my HBase cluster, I found that
>> > one region server was down, and by checking its log I found the following
>> > exception: *EOFException* (during an HBase memstore flush to an HDFS file?
>> > not sure).
>> >
>> >        It seems to be caused by the DFSClient not working. I don't know
>> > the exact reason; maybe it's the heavy load on the machine the datanode is
>> > residing on, or a full disk, but I am not sure which DFS node I should
>> > check.
>> >        Has anybody met the same problem? Any pointer or hint is
>> > appreciated.
>> >
>> >       The log is as follows:
>> >
>> >
>> > 2010-04-06 03:04:34,065 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 20 on 60020' on region hbt2table16,,1270522012397: memstore size 128.0m is >= than blocking 128.0m size
>> > 2010-04-06 03:04:34,712 DEBUG org.apache.hadoop.hbase.regionserver.Store: Completed compaction of 34; new storefile is hdfs://rra-03:8887hbase/hbt2table16/2144402082/34/854678344516838047; store size is 2.9m
>> > 2010-04-06 03:04:34,715 DEBUG org.apache.hadoop.hbase.regionserver.Store: Compaction size of 35: 2.9m; Skipped 0 file(s), size: 0
>> > 2010-04-06 03:04:34,715 DEBUG org.apache.hadoop.hbase.regionserver.Store: Started compaction of 5 file(s) into hbase/hbt2table16/compaction.dir/2144402082, seqid=2914432737
>> > 2010-04-06 03:04:35,055 DEBUG org.apache.hadoop.hbase.regionserver.Store: Added hdfs://rra-03:8887hbase/hbt2table16/2144402082/184/1530971405029654438, entries=1489, sequenceid=2914917785, memsize=203.8k, filesize=88.6k to hbt2table16,,1270522012397
>> > 2010-04-06 03:04:35,442 DEBUG org.apache.hadoop.hbase.regionserver.Store: Completed compaction of 35; new storefile is hdfs://rra-03:8887hbase/hbt2table16/2144402082/35/2952180521700205032; store size is 2.9m
>> > 2010-04-06 03:04:35,445 DEBUG org.apache.hadoop.hbase.regionserver.Store: Compaction size of 36: 2.9m; Skipped 0 file(s), size: 0
>> > 2010-04-06 03:04:35,445 DEBUG org.apache.hadoop.hbase.regionserver.Store: Started compaction of 4 file(s) into hbase/hbt2table16/compaction.dir/2144402082, seqid=2914432737
>> > 2010-04-06 03:04:35,469 DEBUG org.apache.hadoop.hbase.regionserver.Store: Added hdfs://rra-03:8887hbase/hbt2table16/2144402082/185/1984548574711437130, entries=2105, sequenceid=2914917785, memsize=286.7k, filesize=123.9k to hbt2table16,,1270522012397
>> > 2010-04-06 03:04:35,711 DEBUG org.apache.hadoop.hbase.regionserver.Store: Added hdfs://rra-03:8887hbase/hbt2table16/2144402082/186/2470661482474884005, entries=3031, sequenceid=2914917785, memsize=414.0k, filesize=179.1k to hbt2table16,,1270522012397
>> > 2010-04-06 03:04:35,866 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started. Attempting to free 20853136 bytes
>> > 2010-04-06 03:04:37,010 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed. Freed 20866928 bytes. Priority Sizes: Single=17.422821MB (18269152), Multi=150.70126MB (158021728),Memory=0.0MB (0)
>> > 2010-04-06 03:04:37,542 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException
>> > 2010-04-06 03:04:37,542 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-6935524980745310745_1391901
>> > 2010-04-06 03:04:37,607 DEBUG org.apache.hadoop.hbase.regionserver.Store: Completed compaction of 36; new storefile is hdfs://rra-03:8887hbase/hbt2table16/2144402082/36/1570089400510240916; store size is 2.9m
>> > 2010-04-06 03:04:37,612 DEBUG org.apache.hadoop.hbase.regionserver.Store: Compaction size of 37: 2.9m; Skipped 0 file(s), size: 0
>> > 2010-04-06 03:04:37,612 DEBUG org.apache.hadoop.hbase.regionserver.Store: Started compaction of 4 file(s) into hbase/hbt2table16/compaction.dir/2144402082, seqid=2914432737
>> > 2010-04-06 03:04:37,964 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.*EOFException*
>> > 2010-04-06 03:04:37,964 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_2467598422201289982_1391902
>> > 2010-04-06 03:04:43,568 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException
>> > 2010-04-06 03:04:43,568 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-2065206049437531800_1391902
>> > 2010-04-06 03:04:44,044 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException
>> > 2010-04-06 03:04:44,044 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-3059563223628992257_1391902
>> > 2010-04-06 03:05:01,588 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
>> >    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2814)
>> >    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2078)
>> >    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2264)
>> >
>> > 2010-04-06 03:05:01,588 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_-3320281088550177280_1391903 bad datanode[0] nodes == null
>> > 2010-04-06 03:05:01,589 WARN org.apache.hadoop.hdfs.DFSClient: Could not get block locations. Source file "hbase/hbt2table16/2144402082/187/6358539637638901699" - Aborting...
>> > 2010-04-06 03:05:01,589 FATAL org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Replay of hlog required. Forcing server shutdown
>> > org.apache.hadoop.hbase.DroppedSnapshotException: region: hbt2table16,,1270522012397
>> >    at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:977)
>> >    at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:846)
>> >    at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:241)
>> >    at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:149)
>> > Caused by: java.io.EOFException
>> >    at java.io.DataInputStream.readByte(DataInputStream.java:250)
>> >    at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:298)
>> >    at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:319)
>> >    at org.apache.hadoop.io.Text.readString(Text.java:400)
>> >    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2870)
>> >    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2795)
>> >    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2078)
>> >    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2264)
>> > 2010-04-06 03:05:01,603 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics: request=0.0, regions=335, stores=590, storefiles=1231, storefileIndexSize=83, memstoreSize=128, compactionQueueSize=1, usedHeap=710, maxHeap=993, blockCacheSize=162178088, blockCacheFree=46200184, blockCacheCount=2483, blockCacheHitRatio=2, fsReadLatency=0, fsWriteLatency=0, fsSyncLatency=0
>> > 2010-04-06 03:05:01,604 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: regionserver/10.76.112.214:60020.cacheFlusher exiting
>> > 2010-04-06 03:05:01,673 INFO org.apache.hadoop.hbase.regionserver.HLog: Roll hbase/.logs/rrb-08.off.tn.ask.com,60020,1268973923999/hlog.dat.1270523052543, entries=483321, calcsize=88970157, filesize=61838598. New hlog hbase/.logs/rrb-08.off.tn.ask.com,60020,126897392
>>
>
