hadoop-hdfs-issues mailing list archives

From "Ravi Prakash (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HDFS-6489) DFS Used space is not correct computed on frequent append operations
Date Tue, 12 Sep 2017 05:51:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16155977#comment-16155977 ]

Ravi Prakash edited comment on HDFS-6489 at 9/12/17 5:50 AM:
-------------------------------------------------------------

Thanks for your reply [~brahmareddy]! Sorry about the tangent on {{FsDatasetImpl#removeOldReplica}}.
I'm afraid I'm also not sure you are the point person on this. Could you please redirect
me to the right person if you're not?

Let's focus on the {{HDFS6489.java}} test written and reported by Bogdan. I see that it
still fails on trunk. Here's the output:
{code}
$ java HDFS6489
doing small appends...
17/09/06 13:20:25 INFO hdfs.DataStreamer: Exception in createBlockOutputStream blk_1073741835_1057
java.io.EOFException: Unexpected EOF while trying to read response from server
	at org.apache.hadoop.hdfs.protocolPB.PBHelperClient.vintPrefixed(PBHelperClient.java:444)
	at org.apache.hadoop.hdfs.DataStreamer.createBlockOutputStream(DataStreamer.java:1750)
	at org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1495)
	at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1469)
	at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:737)
Exception in thread "main" java.io.IOException: All datanodes [DatanodeInfoWithStorage[127.0.0.1:9866,DS-af60f3f1-eb86-46c2-821a-8d2f1dcb339d,DISK]]
are bad. Aborting...
	at org.apache.hadoop.hdfs.DataStreamer.handleBadDatanode(DataStreamer.java:1549)
	at org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1483)
	at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1469)
	at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:737)
{code}
Why do you think that is?

Where is the code you posted last? I wasn't able to find it in trunk or branch-2.
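
For reference, this is the kind of reproduction I mean (a minimal sketch only; the class name,
paths, and loop counts are illustrative assumptions and this is not Bogdan's actual {{HDFS6489.java}}):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical reproduction class, not the attached HDFS6489.java.
public class SmallAppendRepro {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();   // fs.defaultFS must point at the cluster
    FileSystem fs = FileSystem.get(conf);
    Path file = new Path("/tmp/small-append-test");

    // Seed the file so later appends reopen an already finalized block.
    try (FSDataOutputStream out = fs.create(file, true)) {
      out.write(new byte[1024 * 1024]);         // 1 MB of zeros
    }

    System.out.println("doing small appends...");
    byte[] tenBytes = new byte[10];
    for (int i = 1; i <= 1000; i++) {
      try (FSDataOutputStream out = fs.append(file)) {
        out.write(tenBytes);                    // only 10 bytes per append
      }
      if (i % 100 == 0) {
        // getStatus().getUsed() is the DFS-used figure the datanodes report;
        // it can lag a heartbeat or two, so treat it as indicative only.
        System.out.printf("appends=%d dfsUsed=%d%n", i, fs.getStatus().getUsed());
      }
    }
    fs.close();
  }
}
{code}
If the accounting bug is still present, the reported DFS used should grow far faster than the
~10 KB of data actually written.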



> DFS Used space is not correct computed on frequent append operations
> --------------------------------------------------------------------
>
>                 Key: HDFS-6489
>                 URL: https://issues.apache.org/jira/browse/HDFS-6489
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 2.2.0, 2.7.1, 2.7.2
>            Reporter: stanley shi
>         Attachments: HDFS-6489.001.patch, HDFS-6489.002.patch, HDFS-6489.003.patch,
> HDFS-6489.004.patch, HDFS-6489.005.patch, HDFS-6489.006.patch, HDFS-6489.007.patch,
> HDFS6489.java
>
>
> The current implementation of the Datanode will increase the DFS used space on each
> block write operation. This is correct in most scenarios (creating a new file), but it
> behaves incorrectly when appending small data to a large block.
> For example, I have a file with only one block (say, 60M). Then I append to it very
> frequently, but each time I append only 10 bytes.
> Then on each append, DFS used is increased by the length of the block (60M), not the
> actual data length (10 bytes).
> Consider a scenario where I use many clients appending concurrently to a large number of
> files (1000+), with a block size of 32M (half of the default value). Then DFS used is
> increased by 1000*32M = 32G on each round of appends, even though only about 10K bytes are
> actually written; this causes the datanode to report insufficient disk space on data writes.
> {quote}2014-06-04 15:27:34,719 INFO org.apache.hadoop.hdfs.server.datanode.DataNode:
> opWriteBlock  BP-1649188734-10.37.7.142-1398844098971:blk_1073742834_45306 received exception
> org.apache.hadoop.util.DiskChecker$DiskOutOfSpaceException: Insufficient space for appending
> to FinalizedReplica, blk_1073742834_45306, FINALIZED{quote}
> But the actual disk usage:
> {quote}
> [root@hdsh143 ~]# df -h
> Filesystem            Size  Used Avail Use% Mounted on
> /dev/sda3              16G  2.9G   13G  20% /
> tmpfs                 1.9G   72K  1.9G   1% /dev/shm
> /dev/sda1              97M   32M   61M  35% /boot
> {quote}
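
For clarity, the accounting problem described above amounts to roughly the following (a simplified
model with illustrative names only, not the actual {{FsDatasetImpl}} code):
{code}
// Simplified model of DFS-used accounting on append; the names here are
// illustrative and not the actual FsDatasetImpl members.
public class DfsUsedModel {
  long dfsUsed;

  // Reported behaviour: each append to a finalized block charges the whole
  // block length again, e.g. +60M of DFS used for a 10-byte append.
  void onAppendAsReported(long blockLengthOnDisk, long bytesAppended) {
    dfsUsed += blockLengthOnDisk;
  }

  // Expected behaviour: only the bytes actually written are charged.
  void onAppendExpected(long blockLengthOnDisk, long bytesAppended) {
    dfsUsed += bytesAppended;
  }

  public static void main(String[] args) {
    DfsUsedModel reported = new DfsUsedModel();
    DfsUsedModel expected = new DfsUsedModel();
    long blockLen = 60L << 20;                 // a 60M block already on disk
    for (int i = 0; i < 1000; i++) {           // 1000 ten-byte appends
      reported.onAppendAsReported(blockLen, 10);
      expected.onAppendExpected(blockLen, 10);
    }
    // ~60G of reported growth versus only 10K of data actually written.
    System.out.printf("reported=%d expected=%d%n", reported.dfsUsed, expected.dfsUsed);
  }
}
{code}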



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
