hadoop-hdfs-user mailing list archives

From "Tang" <shawndow...@gmail.com>
Subject MapReduce job temp input files
Date Wed, 29 Oct 2014 05:11:21 GMT

We are running MapReduce jobs on a Hadoop cluster. The job inputs come from log files that are
not in HDFS, so we first copy them into HDFS, and after the job finishes we delete them.
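The copy-run-delete cycle can be sketched as below. All paths and the job jar/class names are hypothetical placeholders, not taken from the original message; the script defaults to a dry run that only prints the commands, so it can be inspected without a live cluster.

```shell
#!/bin/sh
# Sketch of the copy -> run -> delete cycle described above.
# Paths, jar, and class names are hypothetical. DRY_RUN=1 (the default)
# only echoes each command instead of executing it.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$@"        # dry run: print the command
    else
        "$@"             # real run: execute on the cluster
    fi
}

run hdfs dfs -mkdir -p /tmp/job-input
run hdfs dfs -put /var/log/app/app.log /tmp/job-input/
run hadoop jar myjob.jar com.example.MyJob /tmp/job-input /tmp/job-output
# Note: without -skipTrash, "hdfs dfs -rm" moves files to the user's
# .Trash directory, where they still consume HDFS space until expunged --
# one possible contributor to disks filling up despite regular cleanup.
run hdfs dfs -rm -r -skipTrash /tmp/job-input /tmp/job-output
```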
Recently the cluster has become very unstable: the HDFS disks tend to fill up, even though the
valid files total only a few gigabytes. Many invalid blocks remain on the disks. After rebooting
the whole cluster, they are deleted automatically. It seems that restarting only the datanode
does not work: the namenode will not send the delete-block commands to the datanode.

Any ideas about this case?
