Date: Mon, 21 Mar 2016 20:55:25 +0000 (UTC)
From: "Andrew Wang (JIRA)"
To: hdfs-issues@hadoop.apache.org
Reply-To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Commented] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client

    [ https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15205111#comment-15205111 ]

Andrew Wang commented on HDFS-3702:
-----------------------------------

[~stack] from a downstream perspective, could you comment on the usability of providing node lists to this API? This always felt hacky to me, since ultimately the NN is the one who knows about the cluster state and DN names and block location constraints. My impression was that tracking this in HBase was onerous, and is part of why favored nodes fell out of favor.

bq. For example, some application may want to distribute its files uniformly in a cluster....

The main reason for skew I've seen is the local writer case, which this patch attempts to address. It'll still bias to the local rack, but I doubt that'll be an issue in practice, and if it is we can also add another flag for fully random distribution.

> Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client
> -----------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-3702
>                 URL: https://issues.apache.org/jira/browse/HDFS-3702
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>    Affects Versions: 2.5.1
>            Reporter: Nicolas Liochon
>            Assignee: Lei (Eddy) Xu
>            Priority: Minor
>              Labels: BB2015-05-TBR
>         Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, HDFS-3702.002.patch, HDFS-3702.003.patch, HDFS-3702.004.patch, HDFS-3702.005.patch, HDFS-3702.006.patch, HDFS-3702.007.patch, HDFS-3702.008.patch, HDFS-3702_Design.pdf
>
>
> This is useful for Write-Ahead-Logs: these files are written for recovery only, and are not read when there are no failures.
> Taking HBase as an example, these files will be read only if the process that wrote them (the 'HBase regionserver') dies. This will likely come from a hardware failure, hence the corresponding datanode will be dead as well. So we're writing 3 replicas, but in reality only 2 of them are really useful.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
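
For readers following the thread, the placement behaviour under discussion can be sketched roughly as below. This is a toy model, not HDFS code: the function `choose_targets`, the `no_local_write` parameter, and the rack/node names are all hypothetical, and the policy is a simplified assumption (first replica on the writer's node, second on another rack, third on the second replica's rack). With the flag set, the writer's own datanode is excluded, so the first replica falls back to another node on the local rack, which is the "still bias to the local rack" effect mentioned in the comment.

```python
import random


def choose_targets(racks, writer, replication=3, no_local_write=False, rng=None):
    """Toy replica placement. racks maps rack name -> list of datanode names;
    writer is the client's host. All names here are illustrative only."""
    if replication <= 0:
        return []
    rng = rng or random.Random()
    rack_of = {node: rack for rack, nodes in racks.items() for node in nodes}
    all_nodes = list(rack_of)
    # Assumption: the flag simply excludes the writer's own datanode.
    excluded = {writer} if no_local_write else set()
    chosen = []

    def pick(candidates):
        # Choose one node that is neither excluded nor already used.
        pool = [n for n in candidates if n not in excluded and n not in chosen]
        if not pool:
            return False
        chosen.append(rng.choice(pool))
        return True

    # First replica: local node, else local rack, else anywhere.
    local_rack = rack_of.get(writer)
    if not pick([writer] if writer in rack_of else []):
        if not (local_rack is not None and pick(racks[local_rack])):
            pick(all_nodes)

    # Second replica: a rack different from the first replica's rack.
    if chosen and len(chosen) < replication:
        first_rack = rack_of[chosen[0]]
        remote = [n for n in all_nodes if rack_of[n] != first_rack]
        if not pick(remote):
            pick(all_nodes)

    # Remaining replicas: same rack as the previous one, else anywhere.
    while chosen and len(chosen) < replication:
        prev_rack = rack_of[chosen[-1]]
        if not (pick(racks[prev_rack]) or pick(all_nodes)):
            break  # not enough datanodes to satisfy the replication factor
    return chosen
```

In this model, a regionserver on node "a" writing its WAL with `no_local_write=True` never places a replica on "a", so all three replicas survive the loss of that machine, at the cost of the first hop leaving the local node.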