hadoop-hdfs-issues mailing list archives

From "Suresh Srinivas (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HDFS-4979) Implement retry cache on the namenode
Date Wed, 24 Jul 2013 22:53:49 GMT

    [ https://issues.apache.org/jira/browse/HDFS-4979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13718938#comment-13718938
] 

Suresh Srinivas edited comment on HDFS-4979 at 7/24/13 10:52 PM:
-----------------------------------------------------------------

ClientCacheEntry/ClientCacheEntryWithPayload cost:
* Object overhead - 16 bytes
* state - 1 byte
* clientId - 16 bytes
* callId - 4 bytes
* next element reference - 8 bytes
* Payload cost - assume 24 bytes on average
* PriorityQueue expiration time - 8 bytes
* PriorityQueue entry reference - 8 bytes
*Total = 85 bytes*
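The per-entry breakdown above can be sketched as constants that sum to the quoted total; the names mirror the list (the actual patch may lay the class out differently, so treat this as an illustration, not measured sizes):

```java
// Estimated per-entry memory cost of a retry cache entry,
// following the byte-by-byte breakdown above.
public class EntryCost {
    static final int OBJECT_HEADER = 16; // JVM object overhead
    static final int STATE = 1;          // state byte
    static final int CLIENT_ID = 16;     // 16-byte client id
    static final int CALL_ID = 4;        // int call id
    static final int NEXT_REF = 8;       // reference to next element
    static final int PAYLOAD_AVG = 24;   // assumed average payload
    static final int PQ_EXPIRY = 8;      // expiration time (long)
    static final int PQ_REF = 8;         // PriorityQueue entry reference

    static int total() {
        return OBJECT_HEADER + STATE + CLIENT_ID + CALL_ID
                + NEXT_REF + PAYLOAD_AVG + PQ_EXPIRY + PQ_REF;
    }

    public static void main(String[] args) {
        System.out.println("bytes per entry: " + total());
    }
}
```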

Assuming 150 bytes each for a file and a block, a file costs ~300 bytes of namenode heap space.
> The total heap space for F files is 300 * F bytes. Let's generously assume this uses up the entire namenode Java heap.
>> Assuming the number of retry cache entries over a 10-minute period is 1% of the file count, the number of retry cache entries is ~0.01 * F.
>> The heap space for the retry cache, including cached responses, is 0.01 * F * (85 + 8) bytes = ~F bytes. This is ~0.33% of the total Java heap space.
>> The heap space the retry cache map alone should be sized on is the 8-byte entry reference out of the 93 bytes per entry: (8 / 93) * 0.33% of total heap space = ~0.028% of total heap space.

I plan on using 0.03% of total heap space for the retry cache map. That works out to ~0.345% of the Java heap space for the retry cache plus cached responses.
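As a back-of-envelope check of the percentages above (the constants are the assumptions from this comment; the small gap between these figures and the quoted ~0.33% comes from rounding 0.93 * F up to F):

```java
public class SizingCheck {
    static final double BYTES_PER_FILE = 300.0;   // file + blocks on the namenode heap
    static final double ENTRIES_PER_FILE = 0.01;  // 1% of files retried per 10-minute window
    static final double ENTRY_BYTES = 85.0 + 8.0; // cache entry + 8-byte map reference

    // Fraction of the heap used by the retry cache including cached responses.
    static double cacheFraction() {
        return ENTRIES_PER_FILE * ENTRY_BYTES / BYTES_PER_FILE;
    }

    // Fraction of the heap used by the 8-byte map references alone.
    static double mapFraction() {
        return ENTRIES_PER_FILE * 8.0 / BYTES_PER_FILE;
    }

    public static void main(String[] args) {
        System.out.printf("cache + responses: %.3f%% of heap%n", cacheFraction() * 100);
        System.out.printf("map alone:         %.3f%% of heap%n", mapFraction() * 100);
    }
}
```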

Let's look at some examples of how this number works out:
||Heap space||Num of cache entries||Total memory for retry cache||Ops/second (sustained for 10 minutes)||
|1M|64|6.25K|1 op every ~9 seconds|
|256M|16384|1.6M|27|
|1G|65536|6.4M|109|
|64G|4194304|409.6M|6992|

Note: in the above table, the number of cache entries is rounded up to the nearest power of 2.

The number of entries in the table is the maximum number of entries allowed in the retry cache.
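The table values can be reproduced with a small helper that applies the 0.03% budget, 8 bytes per map reference, and power-of-2 rounding (a sketch; `cacheCapacity` is a hypothetical name, not a method from the patch):

```java
public class RetryCacheCapacity {
    // Max number of retry cache entries for a given total heap size.
    static long cacheCapacity(long heapBytes) {
        // 0.03% of the heap for the map, 8 bytes per entry reference.
        long entries = (long) (heapBytes * 0.0003) / 8;
        long cap = 1;
        while (cap < entries) {
            cap <<= 1; // round up to a power of 2
        }
        return cap;
    }

    public static void main(String[] args) {
        long[] heaps = {1L << 20, 256L << 20, 1L << 30, 64L << 30};
        for (long heap : heaps) {
            System.out.println((heap >> 20) + "M heap -> "
                    + cacheCapacity(heap) + " entries");
        }
    }
}
```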

Does this calculation look correct? Does this amount of heap sound reasonable?

                
> Implement retry cache on the namenode
> -------------------------------------
>
>                 Key: HDFS-4979
>                 URL: https://issues.apache.org/jira/browse/HDFS-4979
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>            Reporter: Suresh Srinivas
>            Assignee: Suresh Srinivas
>         Attachments: HDFS-4979.10.patch, HDFS-4979.1.patch, HDFS-4979.2.patch, HDFS-4979.3.patch,
HDFS-4979.4.patch, HDFS-4979.5.patch, HDFS-4979.6.patch, HDFS-4979.7.patch, HDFS-4979.8.patch,
HDFS-4979.9.patch, HDFS-4979.patch
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
