hadoop-general mailing list archives

From tao tony <tonytao0...@outlook.com>
Subject Re: namenode and datanode "Block Pool Used" abnormal growth
Date Mon, 04 Jun 2018 01:35:48 GMT
Hi Kihwal,

Thanks for your kind reply.

I see only 6 files for that table, as shown below.

[hdfs@master ~]$ hadoop fs -ls   /hawq_data/16385/16519/31957
Found 6 items
-rw-------   2 gpadmin gpadmin          0 2018-05-30 18:37 /hawq_data/16385/16519/31957/1
-rw-------   2 gpadmin gpadmin          0 2018-05-30 18:37 /hawq_data/16385/16519/31957/2
-rw-------   2 gpadmin gpadmin          0 2018-05-30 18:37 /hawq_data/16385/16519/31957/3
-rw-------   2 gpadmin gpadmin          0 2018-05-30 18:37 /hawq_data/16385/16519/31957/4
-rw-------   2 gpadmin gpadmin          0 2018-05-30 18:37 /hawq_data/16385/16519/31957/5
-rw-------   2 gpadmin gpadmin          0 2018-05-30 18:37 /hawq_data/16385/16519/31957/6

How could "block pool used" grow by about 100GB when I write to these 6 files frequently?

Thank you again, Kihwal!

I'll set the block size to 64MB.

Tao Jin

On 06/01/2018 09:18 PM, Kihwal Lee wrote:
That's because the files were still open. You get billed for the entire block until the file
is closed (block is finalized).
As an experiment, try reducing "dfs.blocksize" by half.

Kihwal
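
A rough sketch of the accounting described above, assuming each replica of a block that is still open is charged a full block until it is finalized. The 128 MiB block size is an assumption (the usual HDFS 2.7 default; the thread does not state the cluster's setting), the replication factor of 2 is taken from the "hadoop fs -ls" listing, and charged_bytes is a hypothetical helper, not an HDFS API. The thread does not say how many blocks were under construction at once, so the figures are illustrative only.

```python
MIB = 1024 * 1024


def charged_bytes(open_blocks: int, block_size: int, replication: int) -> int:
    """Space charged for blocks still under construction.

    Simplified model (an assumption, not HDFS code): every replica of an
    open block counts as one full block until the block is finalized.
    """
    if open_blocks < 0 or block_size < 0 or replication < 0:
        raise ValueError("arguments must be non-negative")
    return open_blocks * block_size * replication


if __name__ == "__main__":
    # 6 open files with one block under construction each, replication 2.
    for block_size in (128 * MIB, 64 * MIB):
        charged = charged_bytes(6, block_size, 2)
        print(f"dfs.blocksize={block_size // MIB} MiB -> {charged // MIB} MiB charged")
```

Under this model, halving dfs.blocksize halves the charge for the same number of open blocks, which is what the suggested experiment should show.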

On Fri, Jun 1, 2018 at 12:56 AM, tao tony <tonytao0505@outlook.com> wrote:
Hi,

I used Apache HAWQ to write data to HDFS 2.7.3 and ran into a strange problem.

I wrote 300MB of data in total, in 100 commits of 3MB each. But "block pool
used" on each node increased by more than 30GB, and "block pool used" on the
namenode increased by 100GB. When I run "hadoop fs -du -h /", the space has
grown by only 300MB, and the number of blocks has not changed. If I keep
committing small amounts of data, "block pool used" exceeds 100% and writes
fail with a no-space-left error.

After several minutes, "block pool used" gradually decreases back to
normal.

I didn't see anything in the namenode or datanode logs about reclaiming
"block pool used".

Could anyone explain why this happened and how I can solve it? Many thanks!


Tao Jin


