Date: Wed, 22 Feb 2017 19:10:44 +0000 (UTC)
From: "Ravi Prakash (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Commented] (HDFS-11435) NameNode should track open for write files lengths more frequent than on newer block allocations

[ https://issues.apache.org/jira/browse/HDFS-11435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15878998#comment-15878998 ]

Ravi Prakash commented on HDFS-11435:
-------------------------------------

Thanks Manoj! I'll continue the discussion on HDFS-11402. Thanks for volunteering to fix that btw.
It's really important.

> NameNode should track open for write files lengths more frequent than on newer block allocations
> ------------------------------------------------------------------------------------------------
>
> Key: HDFS-11435
> URL: https://issues.apache.org/jira/browse/HDFS-11435
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Manoj Govindassamy
> Assignee: Manoj Govindassamy
>
> *Problem:*
> Currently the length of an open-for-write / under-construction file is updated on the NameNode only at the following points:
> # Block boundary: On block boundaries, upon allocation of a new block, the NameNode learns of the file growth and the file length catches up.
> # hsync(SyncFlag.UPDATE_LENGTH): When a client app invokes hsync on the write stream with this special flag, DataNodes send an incremental block report with the latest file length, which the NameNode uses to update its metadata.
> # First hflush() on a new block: The first time a client app calls hflush() on each new block, the DataNodes notify the NameNode of the latest file length.
> # Output stream close: Forces the DataNodes to update the NameNode with the file length after data persistence and proper acknowledgements in the pipeline.
> So, the lengths the NameNode records for open-for-write files are usually a lot less than the lengths seen by the DN/client. It is highly preferable for the NameNode not to lag in file lengths by the order of a block size for under-construction files, and to have a more frequent, scalable update mechanism for these open file lengths.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org
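
For illustration only, the four update points listed in the issue description can be sketched as a toy model. This is not HDFS code: the class OpenFileLengthSketch and all of its methods are hypothetical, the block size is a made-up small number, and treating a block as "allocated" the moment the previous one fills is a simplifying assumption. The sketch only shows how the length recorded by the NameNode can trail the length seen by the client between those four events.

```java
/**
 * Toy model (hypothetical, not HDFS code) of when a NameNode-side length
 * for an open file catches up with the bytes the client has written,
 * following the four update points named in the issue description.
 */
public class OpenFileLengthSketch {
    private final long blockSize;
    private long actualLength;            // bytes written by the client so far
    private long nameNodeLength;          // length the modeled NameNode has recorded
    private boolean flushedCurrentBlock;  // has hflush() already run on the current block?

    public OpenFileLengthSketch(long blockSize) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        this.blockSize = blockSize;
    }

    /** Client writes more bytes; the modeled NameNode only hears about full blocks. */
    public void write(long bytes) {
        if (bytes < 0) {
            throw new IllegalArgumentException("bytes must be non-negative");
        }
        long blocksBefore = actualLength / blockSize;
        actualLength += bytes;
        long blocksAfter = actualLength / blockSize;
        if (blocksAfter > blocksBefore) {
            // Update point 1: block boundary (assumption: allocation happens here).
            nameNodeLength = blocksAfter * blockSize;
            flushedCurrentBlock = false;
        }
    }

    /** Update point 3: only the first hflush() on a block reports the length. */
    public void hflush() {
        if (!flushedCurrentBlock) {
            nameNodeLength = actualLength;
            flushedCurrentBlock = true;
        }
    }

    /** Update point 2: models hsync with the UPDATE_LENGTH flag. */
    public void hsyncUpdateLength() {
        nameNodeLength = actualLength;
    }

    /** Update point 4: closing the stream reports the final length. */
    public void close() {
        nameNodeLength = actualLength;
    }

    public long actualLength() { return actualLength; }

    public long nameNodeLength() { return nameNodeLength; }

    /** Bytes the modeled NameNode is currently unaware of. */
    public long lag() { return actualLength - nameNodeLength; }

    public static void main(String[] args) {
        OpenFileLengthSketch file = new OpenFileLengthSketch(128); // toy block size
        file.write(300);
        System.out.println("after write(300): lag = " + file.lag());
        file.hflush();
        System.out.println("after first hflush: lag = " + file.lag());
        file.write(50);
        file.hflush(); // second hflush on the same block: no update in this model
        System.out.println("after write(50) + second hflush: lag = " + file.lag());
        file.hsyncUpdateLength();
        System.out.println("after hsync with length update: lag = " + file.lag());
        file.close();
        System.out.println("after close: lag = " + file.lag());
    }
}
```

In this model the lag stays below one block size between update points, which matches the "by order of Block size" concern in the description.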