hadoop-mapreduce-user mailing list archives

From Himanshu Jindal <capri.himan...@gmail.com>
Subject Scalability of Name Node for external clients
Date Thu, 02 Jul 2015 09:18:18 GMT
I have a question regarding the scalability of the name node. Typically the name
node handles two types of clients:
1. Internal clients (data nodes that are part of the Hadoop cluster)
2. External clients (client nodes requesting block locations in order
to perform reads/writes on data nodes)

I am not much concerned about the throughput of internal clients; I am more
worried about the throughput of external clients. What is the expected
throughput of name node operations for external clients, and how scalable is
it? To be more precise, please consider the following example:

There is a typical name node server managing a cluster of 100 data nodes.
Assuming the internal clients use the default block report and heartbeat
settings, I have the following questions regarding the scalability of the
name node:
1. How many simultaneous external client connections can the name
node support? (A hundred thousand?)
2. How many operations (get block locations) can it serve per second?
3. What are the different ways to increase throughput for these external
clients?
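
For scale, the internal load in this example can be estimated with a rough
sketch. It assumes the Hadoop 2.x defaults of dfs.heartbeat.interval = 3
seconds and dfs.blockreport.intervalMsec = 21600000 (6 hours); these values,
and the function name below, are assumptions for illustration and are not
measurements from a real cluster.

```python
# Back-of-envelope estimate of the internal RPC load on the name node.
# Assumed Hadoop 2.x defaults (not stated in the message):
#   dfs.heartbeat.interval       = 3 seconds
#   dfs.blockreport.intervalMsec = 21600000 ms (6 hours)
DATA_NODES = 100
HEARTBEAT_INTERVAL_S = 3
BLOCK_REPORT_INTERVAL_S = 21_600_000 / 1000


def internal_rpc_rates(data_nodes, heartbeat_s, block_report_s):
    """Return (heartbeats per second, block reports per hour) for the cluster."""
    heartbeats_per_s = data_nodes / heartbeat_s
    block_reports_per_hour = data_nodes * 3600 / block_report_s
    return heartbeats_per_s, block_reports_per_hour


hb, br = internal_rpc_rates(
    DATA_NODES, HEARTBEAT_INTERVAL_S, BLOCK_REPORT_INTERVAL_S
)
print(f"heartbeats/s: {hb:.1f}, block reports/hour: {br:.1f}")
```

Under these assumptions the data nodes contribute roughly 33 heartbeats per
second, which is why the external client load is the main concern here.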

Thanks
Himanshu
