hadoop-hdfs-dev mailing list archives

From "Haohui Mai (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-5722) Implement compression in the HTTP server of SNN / SBN instead of FSImage
Date Tue, 07 Jan 2014 00:16:51 GMT
Haohui Mai created HDFS-5722:

             Summary: Implement compression in the HTTP server of SNN / SBN instead of FSImage
                 Key: HDFS-5722
                 URL: https://issues.apache.org/jira/browse/HDFS-5722
             Project: Hadoop HDFS
          Issue Type: Sub-task
            Reporter: Haohui Mai

The current FSImage format supports compression: a field in the header specifies the
compression codec used to compress the data in the image. The main motivation was to reduce
the number of bytes transferred between SNN / SBN / NN.

The main disadvantage, however, is that it requires the client to access the FSImage in strictly
sequential order. This might not fit well with the new design of FSImage. For example, serializing
the data in protobuf allows the client to quickly skip data that it does not understand. The
compression built into the format, however, complicates the calculation of offsets and lengths.
Recovering from a corrupted, compressed FSImage is also non-trivial, as off-the-shelf tools
like bzip2recover are inapplicable.
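
To illustrate the point about skipping data, here is a minimal sketch in Python. It assumes a
simplified length-prefixed section layout standing in for protobuf's length-delimited encoding;
the section names and the helpers write_sections / read_known_sections are hypothetical and do
not reflect the actual FSImage layout. With an uncompressed image, the reader can seek past a
section it does not understand. If the whole stream were compressed, the recorded lengths would
no longer correspond to file offsets, and the reader would have to decompress everything before
the section it wants.

```python
import io
import struct


def write_sections(sections):
    """Serialize (name, payload) pairs as:
    1-byte name length, name, 4-byte big-endian payload length, payload."""
    buf = io.BytesIO()
    for name, payload in sections:
        encoded = name.encode("utf-8")
        buf.write(struct.pack(">B", len(encoded)))
        buf.write(encoded)
        buf.write(struct.pack(">I", len(payload)))
        buf.write(payload)
    return buf.getvalue()


def read_known_sections(stream, known):
    """Return the payloads of known sections and the number of payload
    bytes that were skipped by seeking rather than reading."""
    result = {}
    bytes_skipped = 0
    while True:
        header = stream.read(1)
        if not header:  # end of image
            break
        (name_len,) = struct.unpack(">B", header)
        name = stream.read(name_len).decode("utf-8")
        (payload_len,) = struct.unpack(">I", stream.read(4))
        if name in known:
            result[name] = stream.read(payload_len)
        else:
            # The length field is a real file offset delta, so we can seek.
            stream.seek(payload_len, io.SEEK_CUR)
            bytes_skipped += payload_len
    return result, bytes_skipped


# Hypothetical image with one section the reader does not understand.
image = write_sections([
    ("inodes", b"i" * 1000),
    ("future_feature", b"?" * 5000),
    ("snapshots", b"s" * 200),
])
found, skipped = read_known_sections(io.BytesIO(image), {"inodes", "snapshots"})
```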

This jira proposes to move the compression from the format of the FSImage to the transport
layer, namely, the HTTP server of SNN / SBN. This design simplifies the format of FSImage,
opens up the opportunity to quickly navigate through the FSImage, and eases the process of
recovery. It also retains the benefit of reducing the number of bytes transferred across
the wire, since compression is applied at the transport layer.
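
The idea of compressing at the transport layer can be sketched as follows. This is only an
illustration in Python, not the Hadoop implementation: the functions encode_response and
decode_response are hypothetical, gzip is an assumed choice of codec, and the negotiation
through the standard HTTP Accept-Encoding and Content-Encoding headers is simplified. The image
stays uncompressed on disk; only the bytes on the wire are compressed.

```python
import gzip


def encode_response(image_bytes, accept_encoding):
    """Server side: compress the body only if the client offered gzip.
    Simplification: q-values in Accept-Encoding are ignored."""
    headers = {"Content-Type": "application/octet-stream"}
    offered = [token.split(";")[0].strip().lower()
               for token in accept_encoding.split(",")]
    if "gzip" in offered:
        body = gzip.compress(image_bytes)
        headers["Content-Encoding"] = "gzip"
    else:
        body = image_bytes
    headers["Content-Length"] = str(len(body))
    return headers, body


def decode_response(headers, body):
    """Client side: undo the transport encoding to recover the on-disk bytes."""
    if headers.get("Content-Encoding") == "gzip":
        return gzip.decompress(body)
    return body


# Stand-in for an uncompressed image file; real images are far larger.
image_on_disk = b"INODE" * 20000

headers, wire_body = encode_response(image_on_disk, "gzip, deflate")
received = decode_response(headers, wire_body)

# A client that does not offer gzip receives the bytes unchanged.
plain_headers, plain_body = encode_response(image_on_disk, "identity")
```

Because the encoding is negotiated per request, a client that cannot decompress still receives
a valid image, and the stored file remains seekable either way.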

This message was sent by Atlassian JIRA