hadoop-common-dev mailing list archives

From: Eli Collins <...@cloudera.com>
Subject: Re: TestDU failures on common RedHat distributions
Date: Fri, 19 Nov 2010 03:27:59 GMT
Hey Kirby,

I filed HADOOP-7045. Apologies for the slow reply!

Thanks,
Eli

On Tue, Sep 8, 2009 at 4:50 PM, Kirby Bohling <kirby.bohling@gmail.com> wrote:
> All,
>
>   I was trying to get Hadoop compiling and passing unit tests.  I
> am having problems with the TestDU test.  I have searched the issue
> tracker and googled around, but haven't found much on this failure.
>
> From the TEST-org.apache.hadoop.fs.TestDU.txt:
>
> Testcase: testDU took 5.147 sec
>    FAILED
> expected:<32768> but was:<36864>
> junit.framework.AssertionFailedError: expected:<32768> but was:<36864>
>    at org.apache.hadoop.fs.TestDU.testDU(TestDU.java:79)
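>
> Roughly, the failing check boils down to something like the following
> (my paraphrase for illustration, not the actual TestDU source; the
> helper names are made up and the DU constructor details are from
> memory):
>
>     // Write exactly 32K of data, then ask Hadoop's DU how much space it uses.
>     File data = new File(testDir, "data");    // testDir: some temp directory
>     writeBytes(data, 32 * 1024);              // made-up helper, writes 32768 bytes
>     DU du = new DU(data, 10000);              // refresh interval in ms, as far as I recall
>     du.start();
>     long used = du.getUsed();                 // backed by "du -sk" under the covers
>     du.shutdown();
>     assertEquals(32 * 1024, used);            // expected:<32768> but was:<36864> on ext3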
>
> Source version:
> https://svn.apache.org/repos/asf/hadoop/common/trunk@812652
>
> I found the following issues, but none of them covered the problems I am seeing:
> http://issues.apache.org/jira/browse/HADOOP-2813
> http://issues.apache.org/jira/browse/HADOOP-2845
> http://issues.apache.org/jira/browse/HADOOP-2927
>
> I also saw this e-mail, which I think is misdiagnosing the issue:
> http://markmail.org/message/tlmb63rgootn4ays
>
> I think Konstantin has it exactly backwards.  I see exactly the same
> behavior on both an updated CentOS 5.3 install and a Fedora 11
> install: tmpfs is doing what Hadoop expects, while ext3 is causing
> the unit test failures.  A file that is exactly 32K (as created by
> the unit test) is 36K according to "du -sk" on an ext3 partition.
> On a tmpfs (in-memory) partition it is 32K, as one would "expect".
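>
> For reference, as far as I can tell Hadoop's DU class just shells out
> to "du -sk <path>" and multiplies the result by 1024, so it reports
> whatever the filesystem says it has allocated rather than the logical
> file length.  A stripped-down, standalone sketch of that behavior (my
> own illustration, not the org.apache.hadoop.fs.DU source):
>
>     import java.io.BufferedReader;
>     import java.io.File;
>     import java.io.InputStreamReader;
>
>     public class DuSketch {
>       // Run "du -sk <path>" and return the space used in bytes.
>       static long usedBytes(File path) throws Exception {
>         Process p = new ProcessBuilder("du", "-sk", path.getAbsolutePath()).start();
>         BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream()));
>         String line = r.readLine();           // e.g. "36      /home/shared/data"
>         p.waitFor();
>         long kb = Long.parseLong(line.split("\\s+")[0]);
>         return kb * 1024;                     // 36864 on ext3+ext_attr, 32768 on tmpfs
>       }
>
>       public static void main(String[] args) throws Exception {
>         System.out.println(usedBytes(new File(args[0])));
>       }
>     }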
>
> If I run: ant test -Dtest.build.data=/tmp/hadoop/test, and the du
> calls happen on a tmpfs filesystem, all of the tests pass.  If I run
> "ant test", and the data files end up inside of
> "/home/kbohling/hadoop/build/tmp/data/dutmp/", the unit tests fail.
>
> Running all of these on my Fedora 11 machine:
>
> $ ls -l /tmp/data /home/shared/data
> -rw-rw-r--. 1 kbohling shared         32768 2009-09-08 15:34 /home/shared/data
> -rw-rw-r--. 1 kbohling kbohling 32768 2009-09-08 15:33 /tmp/data
>
> $ stat /tmp/data /home/shared/data  | egrep "Blocks|File"
>  File: `/tmp/data'
>  Size: 32768           Blocks: 64         IO Block: 4096   regular file
>  File: `/home/shared/data'
>  Size: 32768           Blocks: 72         IO Block: 4096   regular file
>
> NOTE: stat's "Blocks" count is in 512-byte units, so the 64 blocks
> correspond to a 32K file (64 * 512 = 32768) and the 72 blocks to a
> 36K file (72 * 512 = 36864), matching the JUnit expected/actual values.
>
> $ df -h /tmp /home/shared
> Filesystem            Size  Used Avail Use% Mounted on
> tmpfs                 2.0G  1.9M  2.0G   1% /tmp
> /dev/mapper/stdfs-home
>                       19G   17G  1.4G  93% /home
>
> $ cat /proc/mounts | egrep "/home|/tmp"
> /dev/mapper/stdfs-home /home ext3
> rw,relatime,errors=continue,user_xattr,acl,data=ordered 0 0
> tmpfs /tmp tmpfs rw,rootcontext=system_u:object_r:tmp_t:s0,relatime 0 0
>
> # dumpe2fs -h /dev/mapper/stdfs-home | grep "Block size"
> dumpe2fs 1.41.4 (27-Jan-2009)
> Block size:               4096
>
> # dumpe2fs -h /dev/mapper/stdfs-home | grep "Filesystem features"
> dumpe2fs 1.41.4 (27-Jan-2009)
> Filesystem features:      has_journal ext_attr resize_inode dir_index
> filetype needs_recovery sparse_super large_file
>
>
> Running just the ext3 checks on a CentOS 5.3 machine (the tmpfs
> results are the same as on Fedora):
>
> # ls -l /root/data
> -rw-r--r-- 1 root root 32768 Sep  8 15:54 /root/data
>
> # stat /root/data | egrep "File|Blocks"
>  File: `/root/data'
>  Size: 32768           Blocks: 72         IO Block: 4096   regular file
>
> # cat /proc/mounts | grep "root"
> rootfs / rootfs rw 0 0
> /dev/root / ext3 rw,data=ordered 0 0
>
> # dumpe2fs -h /dev/root  | grep "Block size"
> dumpe2fs 1.39 (29-May-2006)
> Block size:               4096
>
> # dumpe2fs -h /dev/root  | grep "Filesystem features"
> dumpe2fs 1.39 (29-May-2006)
> Filesystem features:      has_journal ext_attr resize_inode dir_index
> filetype needs_recovery sparse_super large_file
>
> I'm not sure why this is happening on RedHat's variants of Linux (I
> don't have an Ubuntu, SuSE, etc. machine handy, otherwise I'd check
> them too).  But the ext3 filesystem calls are definitely reporting
> 36K of allocated blocks for the file, not 32K.
>
> Okay, a bit more checking, running all of this on Fedora to try a file
> system without extended attributes:
> # lvcreate --name foo --size 1G /dev/stdfs
>
> # mke2fs -j -O ^ext_attr /dev/stdfs/foo
>
> # mount /dev/stdfs/foo /mnt
>
> # cp /home/shared/data /mnt/.
>
> # stat /mnt/data | egrep "File|Blocks"
>  File: `/mnt/data'
>  Size: 32768           Blocks: 64         IO Block: 4096   regular file
>
> NOTE: the same file that is 36K on /home/shared is only 32K here.
>
> So it looks like Hadoop's unit tests will fail whenever they run on an
> ext3 filesystem with the "ext_attr" feature enabled.  If that can't be
> detected at runtime during the tests, it might at least be worth a
> comment in the code.  Alternatively, the test could create a 32K, 64K,
> and 128K file and deem the results acceptable if each file is off from
> its logical size by the same amount, e.g. exactly one 4K block; see
> the sketch below.
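>
> Something along those lines, inside a JUnit test method with the usual
> static imports, might look like this (just a sketch of the idea, not a
> patch; createFileOfSize() and duUsedBytes() are made-up helpers):
>
>     // Accept the du result if every file is off from its logical size
>     // by the same fixed overhead (e.g. one extra 4K block on ext3+ext_attr).
>     long[] sizes = {32 * 1024, 64 * 1024, 128 * 1024};
>     Long overhead = null;
>     for (long size : sizes) {
>       File f = createFileOfSize(size);        // writes exactly that many bytes
>       long used = duUsedBytes(f);             // wraps "du -sk", returns bytes
>       long diff = used - size;
>       assertTrue("du reported less space than the file's length", diff >= 0);
>       if (overhead == null) {
>         overhead = diff;                      // remember the first file's overhead
>       } else {
>         assertEquals("per-file overhead should be constant", (long) overhead, diff);
>       }
>     }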
>
> I can try and boil this down and file a JIRA issue if that is appropriate.
>
> Thanks,
>    Kirby
>
