hadoop-hdfs-user mailing list archives

From Time Less <timelessn...@gmail.com>
Subject Re: HDFS Corruption: How to Troubleshoot or Determine Root Cause?
Date Wed, 18 May 2011 01:45:21 GMT
The answer is that dfs.data.dir wasn't defined, so the data was indeed
being stored under /tmp. Corruption ensued. I've found a page,
http://hadoop.apache.org/common/docs/r0.20.0/cluster_setup.html, that seems
to list a good number of the parameters that should be defined.
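
For anyone who wants to check a config for the same mistake, here is a
minimal sketch in Python. It assumes the usual Hadoop *-site.xml layout
(<configuration> containing <property> entries with <name> and <value>).
The function names and the /data/1/dfs/dn path are made up for
illustration; they are not part of Hadoop.

```python
import xml.etree.ElementTree as ET


def get_property(conf_xml, name):
    """Return the value of a Hadoop config property, or None if it is not set."""
    root = ET.fromstring(conf_xml)
    for prop in root.findall("property"):
        if prop.findtext("name", default="").strip() == name:
            return prop.findtext("value", default="").strip()
    return None


def data_dir_problems(conf_xml):
    """Return a list of human-readable problems with dfs.data.dir (empty if none)."""
    value = get_property(conf_xml, "dfs.data.dir")
    if not value:
        # With no explicit setting, the DataNode uses a default under
        # hadoop.tmp.dir, which itself defaults to a directory under /tmp.
        return ["dfs.data.dir is not set; blocks will be stored under hadoop.tmp.dir"]
    problems = []
    # dfs.data.dir may hold a comma-separated list of directories.
    for path in (p.strip() for p in value.split(",")):
        if path == "/tmp" or path.startswith("/tmp/"):
            problems.append("dfs.data.dir entry %s is under /tmp" % path)
    return problems


if __name__ == "__main__":
    # Hypothetical config text; in practice, read your own hdfs-site.xml.
    sample = """
    <configuration>
      <property>
        <name>dfs.data.dir</name>
        <value>/data/1/dfs/dn,/tmp/dfs/dn</value>
      </property>
    </configuration>
    """
    for problem in data_dir_problems(sample):
        print(problem)
```

This only inspects the text of one file. It does not resolve ${...}
variable references or settings inherited from other config files, so
treat an empty result as "nothing obviously wrong" and not as proof.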


On Tue, May 17, 2011 at 5:16 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:

> Hey Tim,
>
> It looks like you are running with only 1 replica so my first guess is
> that you only have 1 datanode and it's writing to /tmp, which was
> cleaned at some point.
>
> J-D
>
> On Tue, May 17, 2011 at 5:13 PM, Time Less <timelessness@gmail.com> wrote:
> > I loaded data into HDFS last week, and this morning I was greeted with
> > this on the web interface: "WARNING : There are about 32 missing blocks.
> > Please check the log or run fsck."
> >
> > I ran fsck and see several missing and corrupt blocks. The output is
> > verbose, so here's a small sample:
> >
> >
> > /tmp/hadoop-mapred/mapred/staging/hdfs/.staging/job_201104081532_0507/job.jar: CORRUPT block blk_-5745991833770623132
> > /tmp/hadoop-mapred/mapred/staging/hdfs/.staging/job_201104081532_0507/job.jar: MISSING 1 blocks of total size 2945889 B........
> > /user/hive/warehouse/player_game_stat/2011-01-15/datafile: CORRUPT block blk_1642129438978395720
> > /user/hive/warehouse/player_game_stat/2011-01-15/datafile: MISSING 1 blocks of total size 67108864 B................
> >
> > Sometimes the number of dots after the B is quite large (several lines
> > long). Some of these are tmp files, but many are important. If this
> > cluster were prod, I'd have some splaining to do. I need to determine
> > what caused this corruption.
> >
> > Questions:
> >
> > What are the dots after the B? What is the significance of the number
> > of them?
> > Does anyone have suggestions where to start?
> > Are there typical misconfigurations or issues that cause corruption and
> > missing files?
> > What is "the log" that the NameNode web interface refers to?
> >
> > Thanks for any infos! I'm... nervous. :)
> > --
> > Tim Ellis
> > Riot Games
> >
> >
>



-- 
Tim
