hadoop-hdfs-issues mailing list archives

From "dhruba borthakur (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1435) Provide an option to store fsimage compressed
Date Tue, 05 Oct 2010 04:49:35 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12917876#action_12917876 ]

dhruba borthakur commented on HDFS-1435:

+1 on compressing the entire file.

The VERSIONS file should have an entry of the form codec=xxx, where xxx names the codec
(for example, gzip if the fsimage has been compressed using gzip).

1. At startup, the NN reads the VERSIONS file to determine how the fsimage is compressed.
If the VERSIONS file does not have a codec=xxx entry, then the NN assumes that the image is
not compressed.

2. While saving the fsimage, the NN checks whether its own configuration defines a parameter
named io.compression.codec. If it does, the NN uses that codec to compress the fsimage and
also updates the VERSIONS file.

This approach would be fully backward compatible and would support different compression
algorithms for the fsimage.
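
The two steps above can be sketched roughly as follows. This is only an illustration in
Python, not NameNode code: the function names, the OPENERS table, and the flat key=value
layout of the VERSIONS file are assumptions made for the example. Dropping the codec entry
when the image is saved uncompressed is also an assumption; the proposal only says the
VERSIONS file is updated when a codec is configured.

```python
import gzip
import os
import tempfile

# Hypothetical codec table; only gzip is wired up for this sketch.
OPENERS = {"gzip": gzip.open}


def read_version(path):
    """Parse key=value lines into a dict; a missing file yields no entries."""
    props = {}
    if os.path.exists(path):
        with open(path) as f:
            for line in f:
                line = line.strip()
                if line and not line.startswith("#") and "=" in line:
                    key, value = line.split("=", 1)
                    props[key.strip()] = value.strip()
    return props


def write_version(path, props):
    """Write the properties back as sorted key=value lines."""
    with open(path, "w") as f:
        for key, value in sorted(props.items()):
            f.write(f"{key}={value}\n")


def save_image(directory, data, conf):
    """Step 2: compress with the configured codec, if any, and record it."""
    codec = conf.get("io.compression.codec")
    opener = OPENERS[codec] if codec else open
    with opener(os.path.join(directory, "fsimage"), "wb") as f:
        f.write(data)
    version_path = os.path.join(directory, "VERSIONS")
    props = read_version(version_path)
    if codec:
        props["codec"] = codec
    else:
        # Assumption: remove a stale entry so the next load reads raw bytes.
        props.pop("codec", None)
    write_version(version_path, props)


def load_image(directory):
    """Step 1: no codec entry means the image is not compressed."""
    codec = read_version(os.path.join(directory, "VERSIONS")).get("codec")
    opener = OPENERS[codec] if codec else open
    with opener(os.path.join(directory, "fsimage"), "rb") as f:
        return f.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        save_image(d, b"namespace bytes" * 100, {"io.compression.codec": "gzip"})
        print(read_version(os.path.join(d, "VERSIONS")))
        print(len(load_image(d)))
```

An image directory written before this change has no codec entry, so load_image falls
through to a plain open, which is the backward-compatible path described above.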

> Provide an option to store fsimage compressed
> ---------------------------------------------
>                 Key: HDFS-1435
>                 URL: https://issues.apache.org/jira/browse/HDFS-1435
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>    Affects Versions: 0.22.0
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.22.0
> Our HDFS has fsimage as big as 20G bytes. It consumes a lot of network bandwidth when
> secondary NN uploads a new fsimage to primary NN.
> If we could store fsimage compressed, the problem could be greatly alleviated.
> I plan to provide a new configuration hdfs.image.compressed with a default value of false.
> If it is set to be true, fsimage is stored as compressed.
> The fsimage will have a new layout with a new field "compressed" in its header, indicating
> if the namespace is stored compressed or not.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
