hadoop-hdfs-issues mailing list archives

From "Haohui Mai (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5783) Compute the digest before loading FSImage
Date Thu, 16 Jan 2014 00:54:20 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13872890#comment-13872890 ]

Haohui Mai commented on HDFS-5783:

A preliminary experiment shows little impact on loading time. I loaded a 512M
fsimage on my laptop; here are the numbers:

# Loading the FSImage, computing the digest with {{DigestInputStream}}: 9920ms
# Loading the FSImage, without computing the digest: 7467ms
# Calculating MD5 independently: 1231ms

(2) + (3) together take 8698ms, slightly less than (1), because loading the fsimage
currently cannot consume all of the available I/O bandwidth.
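
The measurements above correspond roughly to the two code paths sketched here. This is only an illustration, not the HDFS loader: the class {{DigestSketch}} and the methods {{load}}, {{loadWithDigest}}, and {{digestThenLoad}} are hypothetical names, and {{load}} merely drains the stream where the real code would parse the image.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class DigestSketch {
    // Hypothetical stand-in for the image loader: it only drains the stream.
    static void load(InputStream in) throws IOException {
        byte[] buf = new byte[64 * 1024];
        while (in.read(buf) != -1) {
            // A real loader would parse the sections here.
        }
    }

    // Case (1): the digest is updated on the fly as the loader reads.
    static byte[] loadWithDigest(Path image)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        try (InputStream in =
                new DigestInputStream(Files.newInputStream(image), md5)) {
            load(in);
        }
        return md5.digest();
    }

    // Cases (3) then (2): one pass for the digest, a second pass to load.
    static byte[] digestThenLoad(Path image)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        try (InputStream in = Files.newInputStream(image)) {
            byte[] buf = new byte[64 * 1024];
            int n;
            while ((n = in.read(buf)) != -1) {
                md5.update(buf, 0, n);
            }
        }
        try (InputStream in = Files.newInputStream(image)) {
            load(in); // no digest work on this pass
        }
        return md5.digest();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: DigestSketch <file>");
            return;
        }
        Path image = Paths.get(args[0]);
        long t0 = System.nanoTime();
        loadWithDigest(image);
        long t1 = System.nanoTime();
        digestThenLoad(image);
        long t2 = System.nanoTime();
        System.out.println("on-the-fly: " + (t1 - t0) / 1_000_000 + "ms");
        System.out.println("two passes: " + (t2 - t1) / 1_000_000 + "ms");
    }
}
```

Both methods return the same digest for the same file; only the point at which the hashing work happens differs. Timings from this sketch are not comparable to the numbers above, since no parsing is done and the second run benefits from the page cache.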

> Compute the digest before loading FSImage
> -----------------------------------------
>                 Key: HDFS-5783
>                 URL: https://issues.apache.org/jira/browse/HDFS-5783
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: HDFS-5698 (FSImage in protobuf)
>            Reporter: Haohui Mai
>            Assignee: Haohui Mai
> When loading the fsimage, the current code computes its MD5 digest on-the-fly. It does
> not work when the code does not read all the sections in strictly sequential order.
> This jira proposes to compute the MD5 digest before loading fsimage.
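
The problem described in the issue can be illustrated with a small sketch. An on-the-fly digest sees bytes in the order the loader reads them, so reading sections out of file order hashes a different byte sequence. The class {{OutOfOrderDigest}}, the method {{md5Of}}, and the two section payloads are made up for illustration and do not reflect the real fsimage layout.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

class OutOfOrderDigest {
    // Feeds the chunks to MD5 in the order given, as an on-the-fly digest would.
    static byte[] md5Of(byte[]... chunks) throws NoSuchAlgorithmException {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        for (byte[] chunk : chunks) {
            md5.update(chunk);
        }
        return md5.digest();
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        // Hypothetical section contents; a real image holds serialized data.
        byte[] sectionA = "section-A".getBytes(StandardCharsets.UTF_8);
        byte[] sectionB = "section-B".getBytes(StandardCharsets.UTF_8);

        // Digest of the file as stored on disk: A followed by B.
        byte[] fileOrder = md5Of(sectionA, sectionB);
        // Digest seen by a loader that reads B first, then A.
        byte[] readOrder = md5Of(sectionB, sectionA);

        System.out.println("digests match: "
                + Arrays.equals(fileOrder, readOrder));
    }
}
```

Computing the digest in a separate sequential pass before loading avoids this dependence on read order.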

This message was sent by Atlassian JIRA
