hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Misha Dmitriev (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-12051) Reimplement NameCache in NameNode: Intern duplicate byte[] arrays (mainly those denoting file/directory names) to save memory
Date Tue, 06 Feb 2018 06:12:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-12051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16353441#comment-16353441
] 

Misha Dmitriev commented on HDFS-12051:
---------------------------------------

I tested my change in a relatively small cluster that simulates some customer's workload at
a smaller scale. It contains about 15M HDFS files, NN heap oscillates between about 2.1 and
2.9GiB. Before the change, FSImage load time is 48 sec, and there was the following overhead
due to duplicate byte[] arrays:
||  *Overhead* ||  *# objects* ||  *Unique objects* || *Class name* ||
| 336,794K (14.8%)| 11,309,068| 3,132,447|byte[]|

 

I first applied the current patch, where the NameCache size is set as 1/512th of the total
heap size. It resulted in a cache with 1M entries, taking 4MiB of memory. It turns out that
when the cache size is considerably smaller than the number of unique arrays (3M in the above
case), it leaves behind some duplicate objects. We saved ~100MiB of memory:
||  *Overhead* ||  *# objects* ||  *Unique objects* || *Class name* ||
| 232,435K (10.7%)| 8,599,197| 3,132,367|byte[]|

 

Next, I changed my patch so that by default NameCache is set as 1/400 of the total heap size.
This resulted in a cache with 2M entries, and it eliminated almost all duplicate objects and
saved, compared to the original NameNode, more than 260MiB of memory. The FSImage load time
was 50 sec.
||  *Overhead* ||  *# objects* ||  *Unique objects* || *Class name* ||
| 76,209K (3.8%)| 4,866,933| 3,132,450|byte[]|

> Reimplement NameCache in NameNode: Intern duplicate byte[] arrays (mainly those denoting
file/directory names) to save memory
> -----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-12051
>                 URL: https://issues.apache.org/jira/browse/HDFS-12051
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Misha Dmitriev
>            Assignee: Misha Dmitriev
>            Priority: Major
>         Attachments: HDFS-12051-NameCache-Rewrite.pdf, HDFS-12051.01.patch, HDFS-12051.02.patch,
HDFS-12051.03.patch, HDFS-12051.04.patch, HDFS-12051.05.patch, HDFS-12051.06.patch, HDFS-12051.07.patch,
HDFS-12051.08.patch, HDFS-12051.09.patch
>
>
> When snapshot diff operation is performed in a NameNode that manages several million
HDFS files/directories, NN needs a lot of memory. Analyzing one heap dump with jxray (www.jxray.com),
we observed that duplicate byte[] arrays result in 6.5% memory overhead, and most of these
arrays are referenced by {{org.apache.hadoop.hdfs.server.namenode.INodeFileAttributes$SnapshotCopy.name}}
and {{org.apache.hadoop.hdfs.server.namenode.INodeFile.name}}:
> {code:java}
> 19. DUPLICATE PRIMITIVE ARRAYS
> Types of duplicate objects:
>      Ovhd         Num objs  Num unique objs   Class name
> 3,220,272K (6.5%)   104749528      25760871         byte[]
> ....
>   1,841,485K (3.7%), 53194037 dup arrays (13158094 unique)
> 3510556 of byte[17](112, 97, 114, 116, 45, 109, 45, 48, 48, 48, ...), 2228255 of byte[8](48,
48, 48, 48, 48, 48, 95, 48), 357439 of byte[17](112, 97, 114, 116, 45, 109, 45, 48, 48, 48,
...), 237395 of byte[8](48, 48, 48, 48, 48, 49, 95, 48), 227853 of byte[17](112, 97, 114,
116, 45, 109, 45, 48, 48, 48, ...), 179193 of byte[17](112, 97, 114, 116, 45, 109, 45, 48,
48, 48, ...), 169487 of byte[8](48, 48, 48, 48, 48, 50, 95, 48), 145055 of byte[17](112, 97,
114, 116, 45, 109, 45, 48, 48, 48, ...), 128134 of byte[8](48, 48, 48, 48, 48, 51, 95, 48),
108265 of byte[17](112, 97, 114, 116, 45, 109, 45, 48, 48, 48, ...)
> ... and 45902395 more arrays, of which 13158084 are unique
>      <-- org.apache.hadoop.hdfs.server.namenode.INodeFileAttributes$SnapshotCopy.name
<-- org.apache.hadoop.hdfs.server.namenode.snapshot.FileDiff.snapshotINode <--  {j.u.ArrayList}
<-- org.apache.hadoop.hdfs.server.namenode.snapshot.FileDiffList.diffs <-- org.apache.hadoop.hdfs.server.namenode.snapshot.FileWithSnapshotFeature.diffs
<-- org.apache.hadoop.hdfs.server.namenode.INode$Feature[] <-- org.apache.hadoop.hdfs.server.namenode.INodeFile.features
<-- org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo.bc <-- ... (1 elements)
... <-- org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap$1.entries <-- org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.blocks
<-- org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.blocksMap <-- org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.this$0
<-- j.l.Thread[] <-- j.l.ThreadGroup.threads <-- j.l.Thread.group <-- Java Static:
org.apache.hadoop.fs.FileSystem$Statistics.STATS_DATA_CLEANER
>   409,830K (0.8%), 13482787 dup arrays (13260241 unique)
> 430 of byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 353 of byte[32](116,
97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 352 of byte[32](116, 97, 115, 107, 95, 49, 52,
57, 55, 48, ...), 350 of byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 342 of
byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 341 of byte[32](116, 97, 115, 107,
95, 49, 52, 57, 55, 48, ...), 341 of byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...),
340 of byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 337 of byte[32](116, 97,
115, 107, 95, 49, 52, 57, 55, 48, ...), 334 of byte[32](116, 97, 115, 107, 95, 49, 52, 57,
55, 48, ...)
> ... and 13479257 more arrays, of which 13260231 are unique
>      <-- org.apache.hadoop.hdfs.server.namenode.INodeFile.name <-- org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo.bc
<-- org.apache.hadoop.util.LightWeightGSet$LinkedElement[] <-- org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap$1.entries
<-- org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.blocks <-- org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.blocksMap
<-- org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.this$0
<-- j.l.Thread[] <-- org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap$1.entries
<-- org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.blocks <-- org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.blocksMap
<-- org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.this$0
<-- j.l.Thread[] <-- j.l.ThreadGroup.threads <-- j.l.Thread.group <-- Java Static:
org.apache.hadoop.fs.FileSystem$Statistics.STATS_DATA_CLEANER
> ....
> {code}
> There are several other places in NameNode code which also produce duplicate {{byte[]}}
arrays.
> To eliminate this duplication and reclaim memory, we will need to write a small class
similar to StringInterner, but designed specifically for byte[] arrays.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org


Mime
View raw message