hadoop-general mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Evert Lammerts <Evert.Lamme...@sara.nl>
Subject Stability issue - dead DN's
Date Wed, 11 May 2011 09:23:25 GMT
Hi list,

I notice that whenever our Hadoop installation is put under a heavy load we lose one or two
(on a total of five) datanodes. This results in IOExceptions, and affects the overall performance
of the job being run. Can anybody give me advise or best practices on a different configuration
to increase the stability? Below I've included the specs of the cluster, the hadoop related
config and an example of when which things go wrong. Any help is very much appreciated, and
if I can provide any other info please let me know.

Cheers,
Evert

== What goes wrong, and when ==

See attached a screenshot of Ganglia when the cluster is under load of a single job. This
job:
* reads ~1TB from HDFS
* writes ~200GB to HDFS
* runs 288 Mappers and 35 Reducers

When the job runs it takes all available Map and Reduce slots. The system starts swapping
and there is a short time interval during which most cores are in WAIT. After that the job
really starts running. At around half way, one or two datanodes become unreachable and are
marked as dead nodes. The amount of under-replicated blocks becomes huge. Then some "java.io.IOException:
Could not obtain block" are thrown in Mappers. The job does manage to finish successfully
after around 3.5 hours, but my fear is that when we make the input much larger - which we
want - the system becomes too unstable to finish the job.

Maybe worth mentioning - never know what might help diagnostics.  We notice that memory usage
becomes less when we switch our keys from Text to LongWritable. Also, the Mappers are done
in a fraction of the time. However, this for some reason results in much more network traffic
and makes Reducers extremely slow. We're working on figuring out what causes this.


== The cluster ==

We have a cluster that consists of 6 Sun Thumpers running Hadoop 0.20.2 on CentOS 5.5. One
of them acts as NN and JT, the other 5 run DN's and TT's. Each node has:
* 16GB RAM
* 32GB swapspace
* 4 cores
* 11 LVM's of 4 x 500GB disks (2TB in total) for HDFS
* non-HDFS stuff on separate disks
* a 2x1GE bonded network interface for interconnects
* a 2x1GE bonded network interface for external access

I realize that this is not a well balanced system, but it's what we had available for a prototype
environment. We're working on putting together a specification for a much larger production
environment.


== Hadoop config ==

Here some properties that I think might be relevant:

__CORE-SITE.XML__

fs.inmemory.size.mb: 200
mapreduce.task.io.sort.factor: 100
mapreduce.task.io.sort.mb: 200
# 1024*1024*4 MB, blocksize of the LVM's
io.file.buffer.size: 4194304

__HDFS-SITE.XML__

# 1024*1024*4*32 MB, 32 times the blocksize of the LVM's
dfs.block.size: 134217728
# Only 5 DN's, but this shouldn't hurt
dfs.namenode.handler.count: 40
# This got rid of the occasional "Could not obtain block"'s
dfs.datanode.max.xcievers: 4096

__MAPRED-SITE.XML__

mapred.tasktracker.map.tasks.maximum: 4
mapred.tasktracker.reduce.tasks.maximum: 4
mapred.child.java.opts: -Xmx2560m
mapreduce.reduce.shuffle.parallelcopies: 20
mapreduce.map.java.opts: -Xmx512m
mapreduce.reduce.java.opts: -Xmx512m
# Compression codecs are configured and seem to work fine
mapred.compress.map.output: true
mapred.map.output.compression.codec: com.hadoop.compression.lzo.LzoCodec


Mime
  • Unnamed multipart/mixed (inline, None, 0 bytes)
View raw message