Return-Path:
X-Original-To: apmail-hadoop-hdfs-issues-archive@minotaur.apache.org
Delivered-To: apmail-hadoop-hdfs-issues-archive@minotaur.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
    by minotaur.apache.org (Postfix) with SMTP id 4A689D8E9
    for ; Wed, 24 Oct 2012 08:36:14 +0000 (UTC)
Received: (qmail 78589 invoked by uid 500); 24 Oct 2012 08:36:14 -0000
Delivered-To: apmail-hadoop-hdfs-issues-archive@hadoop.apache.org
Received: (qmail 78567 invoked by uid 500); 24 Oct 2012 08:36:14 -0000
Mailing-List: contact hdfs-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: hdfs-issues@hadoop.apache.org
Delivered-To: mailing list hdfs-issues@hadoop.apache.org
Received: (qmail 78519 invoked by uid 99); 24 Oct 2012 08:36:12 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28)
    by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 24 Oct 2012 08:36:12 +0000
Date: Wed, 24 Oct 2012 08:36:12 +0000 (UTC)
From: "Yanbo Liang (JIRA)"
To: hdfs-issues@hadoop.apache.org
Message-ID: <1622805504.20902.1351067772977.JavaMail.jiratomcat@arcas>
In-Reply-To: <792532207.24542.1333734924091.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HDFS-3219) Disambiguate "visible length" in the code and docs
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394


    [ https://issues.apache.org/jira/browse/HDFS-3219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483080#comment-13483080 ]

Yanbo Liang commented on HDFS-3219:
-----------------------------------

@karlnicholas Thank you for your comments, the explanation is helpful.

> Disambiguate "visible length" in the code and docs
> --------------------------------------------------
>
>                 Key: HDFS-3219
>                 URL: https://issues.apache.org/jira/browse/HDFS-3219
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Eli Collins
>            Priority: Minor
>
> Per HDFS-2288, there are two definitions of visible length, or rather we're using the same name for two things:
> 1. The HDFS-265 design doc, which defines it as a property of the replica:
> {quote}
> visible length is the "number of bytes that have been acknowledged by the downstream DataNodes". It is replica (not block) specific, meaning it can be different for different replicas at a given time. In the document it is called BA (bytes acknowledged), compared to BR (bytes received).
> {quote}
> 2. The definition in HDFS-814 and DFSClient#getVisibleLength, which defines it as a property of a file:
> {quote}
> The visible length is the length that *all* datanodes in the pipeline contain at least such amount of data. Therefore, these data are visible to the readers.
> {quote}
> According to this definition, the visible length of a file is the floor of the visible lengths of all the replicas of the last block. It's a static property set on open, e.g. it is not updated when a writer calls hflush. Also, DFSInputStream#readBlockLength returns the visible length of the first replica it finds, so it seems possible (though unlikely) that in a failure scenario it could return a length longer than what all replicas have.
> This has caused confusion in a number of other jiras. We should update the design doc and javadoc, and perhaps rename DFSClient#getVisibleLength etc., to disambiguate this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
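
To make the two definitions in the quoted description concrete, here is a minimal Java sketch. It is not HDFS code: the class VisibleLengthSketch, the Replica type, the method names, and the byte counts are all hypothetical, chosen only to contrast the per-replica value (BA) with the file-level value (the floor across the replicas of the last block) and with the "first replica found" behaviour the description attributes to DFSInputStream#readBlockLength.

```java
import java.util.List;

// Hypothetical model for illustration only; these are not the HDFS classes.
public class VisibleLengthSketch {

    // Per-replica view (HDFS-265 sense): bytes received (BR) and bytes acknowledged (BA).
    static final class Replica {
        final String datanode;
        final long bytesReceived; // BR
        final long bytesAcked;    // BA, the replica's own "visible length"

        Replica(String datanode, long bytesReceived, long bytesAcked) {
            this.datanode = datanode;
            this.bytesReceived = bytesReceived;
            this.bytesAcked = bytesAcked;
        }
    }

    // File-level view (HDFS-814 sense): the floor (minimum) of BA over the
    // replicas of the last block, so every replica holds at least this much.
    static long minVisibleLength(List<Replica> lastBlockReplicas) {
        return lastBlockReplicas.stream()
                .mapToLong(r -> r.bytesAcked)
                .min()
                .orElse(0L);
    }

    // "First replica found" behaviour as described in the issue: the result
    // can exceed the minimum when replicas disagree.
    static long firstVisibleLength(List<Replica> lastBlockReplicas) {
        return lastBlockReplicas.isEmpty() ? 0L : lastBlockReplicas.get(0).bytesAcked;
    }

    public static void main(String[] args) {
        // Made-up values for three replicas of the last block.
        List<Replica> replicas = List.of(
                new Replica("dn1", 4096, 4096),
                new Replica("dn2", 4096, 3072),
                new Replica("dn3", 3072, 2048));

        System.out.println(minVisibleLength(replicas));   // prints 2048
        System.out.println(firstVisibleLength(replicas)); // prints 4096
    }
}
```

With these made-up values the two methods disagree (2048 vs. 4096), which is the kind of gap the description warns about when a reader takes the first replica's length instead of the floor.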