hadoop-hdfs-dev mailing list archives

From "Kevin Beyer (JIRA)" <j...@apache.org>
Subject [jira] [Reopened] (HDFS-196) File length not reported correctly after application crash
Date Tue, 09 Jun 2015 21:44:02 GMT

     [ https://issues.apache.org/jira/browse/HDFS-196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kevin Beyer reopened HDFS-196:
------------------------------

The HDFS file length reported by ls may be less than the number of bytes returned when
reading the file.  I created the mismatched file by running kill -9 on the client during
a copy, so that the client does not shut down its connection to the namenode properly.
The misreported length persisted after restarting HDFS.

{quote}
$ hdfs dfs -copyFromLocal junk17 /tmp/.
2015-06-09 13:09:25,742 WARN  [main] util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62))
- Unable to load native-hadoop library for your platform... using builtin-java classes where
applicable
^Z
[1]+  Stopped                 hdfs dfs -copyFromLocal junk17 /tmp/.
$ kill -9 %1

[1]+  Stopped                 hdfs dfs -copyFromLocal junk17 /tmp/.
$ fg
-bash: fg: job has terminated
[1]+  Killed: 9               hdfs dfs -copyFromLocal junk17 /tmp/.
$ hdfs dfs -ls /tmp
2015-06-09 13:09:45,730 WARN  [main] util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62))
- Unable to load native-hadoop library for your platform... using builtin-java classes where
applicable
Found 3 items
drwxrwx---   - jane supergroup          0 2015-05-28 14:26 /tmp/hadoop-yarn
drwx-wx-wx   - jane supergroup          0 2015-05-28 14:26 /tmp/hive
-rw-r--r--   1 jane supergroup 1073741824 2015-06-09 13:09 /tmp/junk17._COPYING_
$ hdfs dfs -ls /tmp
2015-06-09 13:09:55,345 WARN  [main] util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62))
- Unable to load native-hadoop library for your platform... using builtin-java classes where
applicable
Found 3 items
drwxrwx---   - jane supergroup          0 2015-05-28 14:26 /tmp/hadoop-yarn
drwx-wx-wx   - jane supergroup          0 2015-05-28 14:26 /tmp/hive
-rw-r--r--   1 jane supergroup 1073741824 2015-06-09 13:09 /tmp/junk17._COPYING_
$ hdfs dfs -cat /tmp/junk17._COPYING_ | wc -c
 1207959752
$ hdfs dfs -ls /tmp
2015-06-09 13:11:21,389 WARN  [main] util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62))
- Unable to load native-hadoop library for your platform... using builtin-java classes where
applicable
Found 3 items
drwxrwx---   - jane supergroup          0 2015-05-28 14:26 /tmp/hadoop-yarn
drwx-wx-wx   - jane supergroup          0 2015-05-28 14:26 /tmp/hive
-rw-r--r--   1 jane supergroup 1073741824 2015-06-09 13:09 /tmp/junk17._COPYING_
$ hdfs dfs -cp /tmp/junk17._COPYING_ /tmp/junk18
2015-06-09 13:13:38,963 WARN  [main] util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62))
- Unable to load native-hadoop library for your platform... using builtin-java classes where
applicable
$ hdfs dfs -ls /tmp
2015-06-09 13:13:45,575 WARN  [main] util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62))
- Unable to load native-hadoop library for your platform... using builtin-java classes where
applicable
Found 4 items
drwxrwx---   - jane supergroup          0 2015-05-28 14:26 /tmp/hadoop-yarn
drwx-wx-wx   - jane supergroup          0 2015-05-28 14:26 /tmp/hive
-rw-r--r--   1 jane supergroup 1073741824 2015-06-09 13:09 /tmp/junk17._COPYING_
-rw-r--r--   1 jane supergroup 1207959552 2015-06-09 13:13 /tmp/junk18
{quote}
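
The two lengths in the final listing above differ by exactly 134217728 bytes. Assuming the cluster uses the Hadoop 2.x default dfs.blocksize of 128 MiB (the report does not state the configured value), that is one full block, which would be consistent with the last block written before the kill not being reflected in the length the namenode reports. A minimal sketch of the arithmetic follows; the class and method names are hypothetical and chosen only for illustration.

```java
public class LengthGap {
    // Assumption: the Hadoop 2.x default for dfs.blocksize (128 MiB).
    // The report does not say what this cluster was configured with.
    static final long ASSUMED_BLOCK_SIZE = 128L * 1024 * 1024;

    // Bytes that are readable but missing from the reported length.
    static long gapBytes(long reportedLength, long readableLength) {
        return readableLength - reportedLength;
    }

    // Number of complete blocks contained in a file of the given length.
    static long wholeBlocks(long length) {
        return length / ASSUMED_BLOCK_SIZE;
    }

    public static void main(String[] args) {
        long reported = 1073741824L; // ls length of /tmp/junk17._COPYING_
        long copied = 1207959552L;   // ls length of /tmp/junk18

        System.out.println("gap in bytes:         " + gapBytes(reported, copied));
        System.out.println("blocks in ls length:  " + wholeBlocks(reported));
        System.out.println("blocks in copy:       " + wholeBlocks(copied));
    }
}
```

Under that block-size assumption, the listed length covers 8 whole blocks and the copy covers 9.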

{quote}
$ hdfs version
Hadoop 2.6.0
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r e3496499ecb8d220fba99dc5ed4c99c8f9e33bb1
Compiled by jenkins on 2014-11-13T21:10Z
Compiled with protoc 2.5.0
From source with checksum 18e43357c8f927c0695f1e9522859d6a
{quote}

> File length not reported correctly after application crash
> ----------------------------------------------------------
>
>                 Key: HDFS-196
>                 URL: https://issues.apache.org/jira/browse/HDFS-196
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Doug Judd
>
> Our application (Hypertable) creates a transaction log in HDFS.  This log is written
> with the following pattern:
> out_stream.write(header, 0, 7);
> out_stream.sync()
> out_stream.write(data, 0, amount);
> out_stream.sync()
> [...]
> However, if the application crashes and then comes back up again, the following statement
> length = mFilesystem.getFileStatus(new Path(fileName)).getLen();
> returns the wrong length.  Apparently this is because this method fetches length information
> from the NameNode which is stale.  Ideally, a call to getFileStatus() would return the accurate
> file length by fetching the size of the last block from the primary datanode.
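
Until the reported length can be trusted, one client-side workaround is to count the bytes that are actually readable instead of relying on getLen(), which is what the cat | wc -c step in the session above does. The sketch below shows that idea with a plain java.io.InputStream; it is an illustration, not part of the HDFS API. In a real client the stream would be the one returned by FileSystem.open(path), and the ReadableLength class and countBytes method are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadableLength {
    // Reads the stream to the end and returns how many bytes were delivered.
    static long countBytes(InputStream in) throws IOException {
        byte[] buffer = new byte[64 * 1024];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in data so the example runs without a cluster. With HDFS,
        // pass the stream obtained from FileSystem.open(path) instead.
        try (InputStream in = new ByteArrayInputStream(new byte[100_000])) {
            System.out.println("readable bytes: " + countBytes(in));
        }
    }
}
```

This costs a full read of the file, so it only makes sense for recovery paths such as replaying a transaction log after a crash.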



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
