hadoop-hdfs-issues mailing list archives

From "Hairong Kuang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1435) Provide an option to store fsimage compressed
Date Thu, 14 Oct 2010 18:18:36 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12921060#action_12921060 ]

Hairong Kuang commented on HDFS-1435:
-------------------------------------

I ran experiments with a secondary NameNode using our internal 0.20 branch. I used LzoCodec
to compress the image. Here are the results:

|| ||uncompressed||LZO compressed||
|image size|13G|2.9G|
|load image from disk|5 mins|8 mins|
|save image to disk|2 mins|4.5 mins|
|download image from primary NN|16.5 mins|6.5 mins|
|upload image to primary NN|16.5 mins|6.5 mins|
|whole checkpoint|40 mins|25 mins|

The results show that a compressed image greatly reduces the time spent downloading and
uploading the image (10 minutes saved in each direction), although it adds 5.5 minutes of
overhead to loading and saving the image (3 minutes to load, 2.5 minutes to save). Overall
this gives us a 15 minute reduction for checkpointing a 13G image.
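
To make the arithmetic explicit, here is a small sketch that recomputes the totals from the per-phase numbers in the table. The dictionary and function names are mine, for illustration only. Note that the compressed phases sum to 25.5 minutes, while the table reports 25 minutes for the whole checkpoint, so the reported figure appears to be rounded.

```python
# Per-phase checkpoint timings in minutes, copied from the table:
# (uncompressed, LZO compressed)
TIMINGS = {
    "load image from disk": (5.0, 8.0),
    "save image to disk": (2.0, 4.5),
    "download image from primary NN": (16.5, 6.5),
    "upload image to primary NN": (16.5, 6.5),
}


def totals(timings):
    """Return (uncompressed_total, compressed_total) in minutes."""
    uncompressed = sum(u for u, _ in timings.values())
    compressed = sum(c for _, c in timings.values())
    return uncompressed, compressed


def delta(timings, phases):
    """Return compressed minus uncompressed minutes over the given phases."""
    return sum(timings[p][1] - timings[p][0] for p in phases)


# Extra time spent on local disk I/O when the image is compressed.
disk_cost = delta(TIMINGS, ["load image from disk", "save image to disk"])

# Time saved on the network when the image is compressed.
network_saving = -delta(
    TIMINGS,
    ["download image from primary NN", "upload image to primary NN"],
)

if __name__ == "__main__":
    print(totals(TIMINGS), disk_cost, network_saving)
```

The net saving from the phase numbers is network_saving minus disk_cost, which is within half a minute of the 15 minutes quoted from the whole-checkpoint row.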

As Lu pointed out, another obvious and easy optimization is to skip downloading the image
from the primary NameNode when the secondary already has the same one. With a compressed
image, this would save a further 6.5 minutes per checkpoint.
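
One way such a check could look, as a rough sketch only: the secondary compares a digest of its local image with a digest reported by the primary and downloads only on a mismatch. How the primary would expose that digest is an assumption here, and `file_digest` and `needs_download` are hypothetical names, not existing HDFS APIs. A real implementation might instead compare a recorded checkpoint identifier to avoid rereading a multi-gigabyte file.

```python
import hashlib


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_download(local_image_path, primary_digest):
    """Return True if the secondary must fetch the image from the primary.

    primary_digest is assumed to be supplied by the primary NameNode;
    that mechanism is hypothetical and not shown here.
    """
    try:
        return file_digest(local_image_path) != primary_digest
    except FileNotFoundError:
        # No local copy at all, so the image has to be downloaded.
        return True
```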

> Provide an option to store fsimage compressed
> ---------------------------------------------
>
>                 Key: HDFS-1435
>                 URL: https://issues.apache.org/jira/browse/HDFS-1435
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>    Affects Versions: 0.22.0
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.22.0
>
>         Attachments: checkpoint-limitandcompress.patch, trunkImageCompress.patch, trunkImageCompress1.patch
>
>
> Our HDFS has an fsimage as big as 20G bytes. Uploading a new fsimage from the secondary NN
> to the primary NN consumes a lot of network bandwidth.
> If we could store the fsimage compressed, the problem could be greatly alleviated.
> I plan to provide a new configuration hdfs.image.compressed with a default value of false.
> If it is set to true, the fsimage is stored compressed.
> The fsimage will have a new layout with a new field "compressed" in its header, indicating
> whether the namespace is stored compressed or not.
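
The header-flag idea in the description above can be illustrated with a toy format. This is a minimal sketch under my own assumptions, not the actual fsimage layout: the header here is just a 4-byte version plus a 1-byte "compressed" flag, `LAYOUT_VERSION` is a placeholder value, and zlib stands in for whatever codec is configured.

```python
import io
import struct
import zlib

# Hypothetical header: big-endian 4-byte layout version, then a 1-byte flag.
HEADER = struct.Struct(">i?")
LAYOUT_VERSION = -1  # placeholder, not a real HDFS layout version


def save_image(out, namespace_bytes, compress):
    """Write the header, then the namespace, compressed if requested."""
    out.write(HEADER.pack(LAYOUT_VERSION, compress))
    out.write(zlib.compress(namespace_bytes) if compress else namespace_bytes)


def load_image(inp):
    """Read the header and use its flag to decide how to decode the rest."""
    _version, compressed = HEADER.unpack(inp.read(HEADER.size))
    body = inp.read()
    return zlib.decompress(body) if compressed else body


if __name__ == "__main__":
    data = b"/user/example/file" * 1000
    buf = io.BytesIO()
    save_image(buf, data, compress=True)
    buf.seek(0)
    assert load_image(buf) == data
```

Because the flag travels with the image, a reader does not need to know how the writer was configured, so compressed and uncompressed images can coexist.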

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

