Mailing-List: contact hdfs-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: hdfs-issues@hadoop.apache.org
Date: Tue, 4 Mar 2014 08:52:21 +0000 (UTC)
From: "Vinayakumar B (JIRA)" <jira@apache.org>
To: hdfs-issues@hadoop.apache.org
Message-ID: <JIRA.12554862.1336711540226.8927.1393923141317@arcas>
In-Reply-To: <JIRA.12554862.1336711540226@arcas>
References: <JIRA.12554862.1336711540226@arcas>
Subject: [jira] [Updated] (HDFS-3405) Checkpointing should use HTTP POST or
 PUT instead of GET-GET to send merged fsimages
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


     [ https://issues.apache.org/jira/browse/HDFS-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinayakumar B updated HDFS-3405:
--------------------------------

    Attachment: HDFS-3405.patch

Hi All,
Sorry for the late response here. 

Changes in the latest patch:
1.   Verified uploading of big image files and confirmed that the *timeout (dfs.image.transfer.timeout)* set while uploading the file is just a socket timeout, not a timeout for the entire transfer. This was confirmed by uploading a 1GB image file with a timeout of only 5 sec. The upload succeeded even though the total transfer over a very slow network took ~5 min. So we can reduce the default value of *dfs.image.transfer.timeout* to 60 seconds, which is the default socket timeout in Hadoop.

2.  I was hitting OOME while uploading 2GB files due to internal buffering of the HTTP stream. Even after increasing the heap size up to 15GB, the upload still took a lot of time.
    So, used {{connection.setChunkedStreamingMode(64 * 1024);}} with one extra parameter, {{File-Length}}, instead of {{Content-Length}} to indicate the file length for verification. Since {{Content-Length}} is an integer, it will not be correct for files larger than 2GB.
    After that, verified that files larger than 2GB upload successfully.

3. Addressed all previous comments.
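
For reference, the approach in item 2 can be sketched roughly as below. This is only an illustration and not the code in the attached patch: the class and method names ({{ImageUploadSketch}}, {{prepare}}, {{copy}}, {{upload}}), the choice of PUT, and the use of one timeout value for both connect and read are assumptions. Only {{setChunkedStreamingMode(64 * 1024)}} and the {{File-Length}} parameter come from the description above; sending {{File-Length}} as a request header is also an assumption.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

/** Hypothetical sketch of a chunked image upload; not the HDFS-3405 patch. */
final class ImageUploadSketch {
    static final int CHUNK_SIZE = 64 * 1024;
    // Assumed to be sent as a request header; holds the full length as a long.
    static final String FILE_LENGTH_HEADER = "File-Length";

    /** Configures the connection without opening it. */
    static HttpURLConnection prepare(URL url, long fileLength, int timeoutMs)
            throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("PUT");
        conn.setDoOutput(true);
        // Both timeouts bound a single connect or blocking read,
        // not the duration of the whole transfer.
        conn.setConnectTimeout(timeoutMs);
        conn.setReadTimeout(timeoutMs);
        // Stream the body in chunks instead of buffering it all in memory.
        conn.setChunkedStreamingMode(CHUNK_SIZE);
        conn.setRequestProperty(FILE_LENGTH_HEADER, Long.toString(fileLength));
        return conn;
    }

    /** Copies the stream in CHUNK_SIZE pieces and returns the bytes copied. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[CHUNK_SIZE];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
            total += n;
        }
        return total;
    }

    /** Uploads the file and returns the HTTP response code. */
    static int upload(URL url, File image, int timeoutMs) throws IOException {
        HttpURLConnection conn = prepare(url, image.length(), timeoutMs);
        try (InputStream in = new FileInputStream(image);
             OutputStream out = conn.getOutputStream()) {
            copy(in, out);
        }
        return conn.getResponseCode();
    }
}
```

On the receiving side, the value of {{File-Length}} would be parsed as a long and compared with the number of bytes actually written, which is the verification mentioned in item 2.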


> Checkpointing should use HTTP POST or PUT instead of GET-GET to send merged fsimages
> ------------------------------------------------------------------------------------
>
>                 Key: HDFS-3405
>                 URL: https://issues.apache.org/jira/browse/HDFS-3405
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 1.0.0, 3.0.0, 2.0.5-alpha
>            Reporter: Aaron T. Myers
>            Assignee: Vinayakumar B
>         Attachments: HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch
>
>
> As Todd points out in [this comment|https://issues.apache.org/jira/browse/HDFS-3404?focusedCommentId=13272986&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13272986], the current scheme for a checkpointing daemon to upload a merged fsimage file to an NN is to issue an HTTP GET request to tell the target NN to issue another GET request back to the checkpointing daemon to retrieve the merged fsimage file. There's no fundamental reason the checkpointing daemon can't just use an HTTP POST or PUT to send back the merged fsimage file, rather than the double-GET scheme.


--
This message was sent by Atlassian JIRA
(v6.2#6252)