hadoop-hdfs-dev mailing list archives

From: dlmar...@comcast.net
Subject: Re: HDFS 2.6.0 upgrade ends with missing blocks
Date: Thu, 08 Jan 2015 20:48:12 GMT
Colin, 

Thanks for the response. Understanding the details is important, and I think some general
guidelines would be great. Since my initial email, the system administrators have told me that
the drives are not actually full; the filesystems by default keep 5% in reserve. We can lower
the reserve by 1%, which frees up about 30GB on a 3TB drive. We have been able to do just that,
restart the DNs, and then everything reported in. At this point I can run the balancer and try
to even things out across the DNs. Running the balancer before the upgrade was not really an
option, since in CDH 5.1.2 it did not work due to HDFS-6621. As a side note, I did apply the
fix from HDFS-6621 and the balancer still didn't work (or I did something wrong). 

I think adding some properties for the upgrade would be a good idea. I would rather see an
upgrade fail with "not enough space" than have an upgrade appear to succeed and then end up
with missing blocks; from a user's perspective, that upgrade still failed, just in a worse way.
Regarding existing failed drives, maybe the upgrade instructions should have the system
administrators unmount them first. 

Dave Marion 

----- Original Message -----

From: "Colin P. McCabe" <cmccabe@apache.org> 
To: hdfs-dev@hadoop.apache.org 
Sent: Thursday, January 8, 2015 2:43:29 PM 
Subject: Re: HDFS 2.6.0 upgrade ends with missing blocks 

Hi dlmarion, 

In general, any upgrade process we do will consume disk space, because 
it creates hardlinks, a new "current" directory, and so forth. So 
upgrading when disk space is very low is a bad idea in any scenario. 
It's certainly a good idea to free up some space before doing the 
upgrade. We should put a note about this somewhere. We should also add 
this check to the various Hadoop management software packages. 
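
As a rough illustration of the kind of pre-flight check such tooling could run (the 
data-directory paths and the 1 GiB-per-volume margin below are made-up examples, not 
official numbers), here is a plain-Java sketch: 

    import java.io.File;

    /**
     * Back-of-envelope pre-upgrade check: warn if a DataNode data directory looks
     * too full to hold the new "current" tree created during an upgrade. Pass the
     * values of dfs.datanode.data.dir as arguments, or edit the defaults below.
     */
    public class PreUpgradeSpaceCheck {
      public static void main(String[] args) {
        final long minFreeBytes = 1L << 30; // assumed safety margin: 1 GiB per volume
        String[] dataDirs = args.length > 0
            ? args
            : new String[] { "/data/1/dfs/dn", "/data/2/dfs/dn" }; // example paths
        for (String dir : dataDirs) {
          long usable = new File(dir).getUsableSpace(); // space available after the filesystem's reserve
          System.out.printf("%s: %.1f GiB usable%n", dir, usable / (double) (1L << 30));
          if (usable < minFreeBytes) {
            System.err.println("WARNING: " + dir + " may not have enough room for a safe upgrade");
          }
        }
      }
    }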

Unfortunately, I can't tell you exactly how much space you need to do an 
upgrade. As you guessed, it depends on the local UNIX filesystem you are 
using. It also depends on how you formatted the filesystem: on ext4, for 
example, this is controlled by the bytes-per-inode setting of the 
filesystem, multiplied by the number of inodes. ext4 may also run out of 
inodes even though there is free space on the disk (though this is very 
unlikely when using HDFS with reasonably large block sizes). What I 
would say is that you need at least as many free inodes as blocks, plus 
a few kilobytes of space for writing the new VERSION files and other 
metadata files. In general, leave a generous number of megabytes free to 
avoid problems with things like the block scanner unexpectedly filling 
up your disk, or the balancer filling it up during an MR or other job. 
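
For a concrete sense of the numbers, here is a rough estimate of the fixed directory 
overhead the 2.6.0 layout adds per block pool per volume. It assumes HDFS-6482 pre-creates 
a 256 x 256 subdirectory tree and that each empty directory costs one 4 KiB block on ext4; 
both are my reading, not official figures: 

    /**
     * Rough estimate of the directory overhead added per block pool per volume by
     * the 2.6.0 block-id-based layout, assuming a pre-created 256 x 256 subdirectory
     * tree and ext4's default 4 KiB block per empty directory. Each directory also
     * consumes one inode, on top of the "as many free inodes as blocks" guideline.
     */
    public class UpgradeOverheadEstimate {
      public static void main(String[] args) {
        long leafDirs = 256L * 256;          // 65,536 (ignoring the 256 first-level parents)
        long bytesPerDir = 4096;             // assumed: one ext4 block per empty directory
        long totalBytes = leafDirs * bytesPerDir;
        System.out.printf("~%,d directories, ~%d MiB per block pool per volume%n",
            leafDirs, totalBytes >> 20);     // prints ~65,536 directories, ~256 MiB
      }
    }

That lines up with the 256MB figure in the original mail quoted below. 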

You mentioned that the DataNode continued to start even though some of 
the volumes couldn't be upgraded, because 
dfs.datanode.failed.volumes.tolerated was non-zero. That is definitely 
unfortunate. Maybe we could add another configuration, named 
dfs.datanode.failed.upgrade.volumes.tolerated, that takes effect when 
upgrading. That way you could specify that failures are not acceptable 
during an upgrade, i.e., that they cause the upgrade to fail. Would this 
be helpful? The main downside that I can see -- and it is a big one -- 
is that if there are any existing failed drives, then the upgrade would 
not succeed. I think this would be scary for a lot of administrators. 
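
To make the proposal concrete, here is a toy sketch of the two policies; note that 
dfs.datanode.failed.upgrade.volumes.tolerated is only the property proposed above and does 
not exist in any release: 

    /**
     * Toy sketch of the volume-failure policies discussed above. The existing
     * dfs.datanode.failed.volumes.tolerated lets the DN start with some failed
     * volumes; the proposed (non-existent) dfs.datanode.failed.upgrade.volumes.tolerated
     * would apply a separate, stricter limit while an upgrade is in progress.
     */
    public class VolumeFailurePolicySketch {
      static boolean mayProceed(int failedVolumes, int tolerated) {
        return failedVolumes <= tolerated;
      }

      public static void main(String[] args) {
        // Today: a volume that fails doUpgrade() is simply treated as a failed volume,
        // so with failed.volumes.tolerated=1 the DN starts and its blocks go missing.
        System.out.println(mayProceed(1, 1)); // true
        // With an upgrade-time tolerance of 0, the same failure would abort the upgrade.
        System.out.println(mayProceed(1, 0)); // false
      }
    }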

best, 
Colin 
Cloudera 


On Wed, Jan 7, 2015 at 5:25 AM, <dlmarion@comcast.net> wrote: 
> 
> I recently upgraded from CDH 5.1.2 to CDH 5.3.0. I know, contact Cloudera, but this is
> actually a generic issue. After the upgrade I brought up the DNs, and after all of them had
> checked in I ended up with missing blocks. I tracked this down in the DN logs to an error at
> startup where the DN is failing to create subdirectories. This happens at
> BlockPoolSliceStorage.doUpgrade(). It appears that the directory structure has changed with
> HDFS-6482 and the DN is pre-creating all of the directories at DN startup time. If the disk
> is near full, then it fails to create the subdirectories because doing so consumes the
> remaining space. If the hdfs configuration allows failed drives
> (dfs.datanode.failed.volumes.tolerated > 0), then the DN will start without the now-full disk
> and report all of the blocks except the ones on the full disk. 
> 
> I didn't find any type of warning in the Apache release notes; a note there might be useful
> for people in a similar situation. For the Cloudera folks on this list, there is no warning
> or note in your upgrade instructions that I could find either. 
> 
> Some questions: 
> 
> 1. How much free space is needed per disk to pre-create the directory structure? Is it
> dependent on the type of filesystem? I calculated 256MB given my reading of the ticket, but I
> may have misunderstood something. 
> 
> 2. Now that block locations are calculated using the block id, are there restrictions on
> where blocks can be placed? I assume that the location is not verified on a read, for
> backwards compatibility; if that is not true, then someone needs to comment on HDFS-1312 that
> the older utilities cannot be used. I need to move blocks from the full disks to other
> locations, so I'm looking for any restrictions on doing that. 
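
On question 2, here is a sketch of the id-to-directory mapping as I read the 2.6.0 code for 
HDFS-6482 (modelled on DatanodeUtil.idToBlockDir(); treat it as an illustration, not the 
authoritative source): 

    import java.io.File;

    /**
     * Sketch of the block-id-based layout from HDFS-6482, as I read the 2.6.0 code:
     * two bytes of the block id select the subdirectory, giving the 256 x 256 tree
     * under each volume's "finalized" directory.
     */
    public class BlockIdLayoutSketch {
      static File idToBlockDir(File finalizedDir, long blockId) {
        int d1 = (int) ((blockId >> 16) & 0xFF);
        int d2 = (int) ((blockId >> 8) & 0xFF);
        return new File(finalizedDir, "subdir" + d1 + File.separator + "subdir" + d2);
      }

      public static void main(String[] args) {
        File finalized = new File("/data/1/dfs/dn/current/BP-example-1/current/finalized");
        // The path is a pure function of the id, e.g. block 1073745000 -> subdir0/subdir12.
        System.out.println(idToBlockDir(finalized, 1073745000L));
      }
    }

If that reading is right, a replica moved to another volume should keep the same 
subdirX/subdirY relative path, since the DN derives the expected directory from the id alone; 
I would verify against the actual 2.6.0 source before moving anything, though. 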

