hadoop-common-user mailing list archives

From "Cagdas Gerede" <cagdas.ger...@gmail.com>
Subject HDFS: Good practices for Number of Blocks per Datanode
Date Fri, 02 May 2008 17:37:16 GMT
As an addition to my question at the bottom,

I was wondering what your suggestion would be on how many blocks a
datanode should be responsible for.
For a system with 60 million blocks, we can have 3 datanodes with 20 million
blocks each, or we can have 60 datanodes with 1 million blocks each. In
either case, would there be performance implications, or would they behave
the same way?
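One concrete difference between the two layouts is the size of each datanode's periodic block report to the namenode, which grows with the number of blocks that datanode holds. A rough sketch of the comparison (the 24 bytes per reported block is an assumed illustrative figure, not an HDFS constant):

```python
# Compare approximate per-datanode block-report payload for the two layouts
# above. BYTES_PER_BLOCK_IN_REPORT is an assumption for illustration
# (roughly three 8-byte fields per block), not an HDFS constant.
BYTES_PER_BLOCK_IN_REPORT = 24

layouts = {
    "3 datanodes x 20M blocks each": 20_000_000,
    "60 datanodes x 1M blocks each": 1_000_000,
}

for name, blocks_per_node in layouts.items():
    report_mb = blocks_per_node * BYTES_PER_BLOCK_IN_REPORT / 1024 ** 2
    print(f"{name}: ~{report_mb:.0f} MB per block report")
```

Whatever the exact per-block cost, a 20-million-block report is about twenty times the work to build, send, and process than a 1-million-block one, so the two layouts are unlikely to behave the same way.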

I guess what I would like to ask in general is: as we need more and more
storage, should we add new datanodes to the system, or should we add more
hard disk space to the existing datanodes?

I appreciate your comments,

On Fri, May 2, 2008 at 10:25 AM, Cagdas Gerede <cagdas.gerede@gmail.com> wrote:

> In the system I am working on, we have 6 million blocks in total, the
> namenode heap size is about 600 MB, and it takes about 5 minutes for the
> namenode to leave safe mode.
> I am trying to estimate what the heap size would be if we had 100 - 150
> million blocks, and how long the namenode would take to leave safe mode.
> From the extrapolation based on the numbers I have, I am calculating very
> scary numbers for both: terabytes of heap size and a startup time of half
> an hour or so. I am hoping that my extrapolation is not accurate.
> From your clusters, could you provide some numbers for the number of files
> and blocks in the system vs. the master heap size and master startup time?
> I really appreciate your help.
> Thanks.
> Cagdas
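As a sanity check, the figures quoted above can be extrapolated linearly. This is only a rough sketch: real namenode memory use depends on the file-to-block ratio, replication factor, and JVM overhead, and safe-mode time depends on how quickly datanodes report in, so the relationship is not strictly linear.

```python
# Linear extrapolation from the figures quoted above:
# 6 million blocks -> ~600 MB namenode heap and ~5 minutes in safe mode.
blocks_now = 6_000_000
heap_now_mb = 600
safemode_now_min = 5

# Derived from the quoted numbers, not an official HDFS constant.
bytes_per_block = heap_now_mb * 1024 * 1024 / blocks_now  # ~105 bytes/block

for target_blocks in (100_000_000, 150_000_000):
    heap_gb = target_blocks * bytes_per_block / 1024 ** 3
    minutes = safemode_now_min * target_blocks / blocks_now
    print(f"{target_blocks:,} blocks -> ~{heap_gb:.1f} GB heap, "
          f"~{minutes:.0f} min to leave safe mode")
```

Under this strictly linear model, 100 - 150 million blocks comes out in the tens of gigabytes of heap and roughly one to two hours of startup time, rather than terabytes, so the scarier numbers would imply super-linear growth somewhere.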

Best Regards, Cagdas Evren Gerede
Home Page: http://cagdasgerede.info
