Subject: Re: HDFS Federation address performance issue
From: Suresh Srinivas
To: "hdfs-user@hadoop.apache.org"
Date: Tue, 28 Jan 2014 12:01:50 -0800

Response inline...

On Tue, Jan 28, 2014 at 10:04 AM, Anfernee Xu wrote:

> Hi,
>
> Based on
> http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/Federation.html#Key_Benefits,
> overall performance can be improved by federation, but I'm not sure
> federation addresses my use case. Could someone elaborate?
>
> My use case: I have a single NameNode (NN) and several DataNodes (DN),
> and a bunch of concurrent MR jobs that create new files (plain files and
> subdirectories) under the same parent directory. The questions are:
>
> 1) Will these concurrent writes (new plain files and subdirectories
> under the same parent directory) run sequentially because write-once
> control is governed by a single NN?

The NameNode commits multiple requests in a batch. Inside the NameNode itself, the lock for write operations makes them sequential. But the lock is held only for a short duration, so from the perspective of multiple clients the file creations appear simultaneous.

If you are talking about a single client with a single thread, then the creations would be sequential.

Hope that makes sense.

> I need this answer to estimate the necessity of moving to HDFS federation.
>
> Thanks
>
> --
> --Anfernee
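
To make the locking behavior described in the reply concrete, here is a toy model in Python. It is only a sketch under stated assumptions, not HDFS code: the class ToyNameNode, its methods, the sync_cost parameter, and the /parent paths are all hypothetical names made up for illustration. The model holds a global lock only while updating the in-memory namespace, then flushes the edit log outside that lock, so one flush can cover edits queued by several clients.

```python
import threading
import time


class ToyNameNode:
    """Toy model only (hypothetical, not the HDFS implementation)."""

    def __init__(self, sync_cost=0.005):
        self._ns_lock = threading.Lock()    # short-duration namespace write lock
        self._sync_lock = threading.Lock()  # one flush at a time
        self.namespace = set()
        self._next_txid = 0
        self.synced_txid = -1
        self.sync_calls = 0
        self.sync_cost = sync_cost          # simulated flush latency in seconds

    def create(self, path):
        # Namespace updates are serialized, but the lock is held only briefly.
        with self._ns_lock:
            if path in self.namespace:
                raise FileExistsError(path)
            self.namespace.add(path)
            txid = self._next_txid
            self._next_txid += 1
        # The slow part (durable logging) happens outside the namespace lock.
        self._log_sync(txid)

    def _log_sync(self, txid):
        with self._sync_lock:
            if txid <= self.synced_txid:
                return  # already covered by another client's batched flush
            with self._ns_lock:
                upto = self._next_txid - 1  # flush everything logged so far
            time.sleep(self.sync_cost)      # simulated disk flush
            self.synced_txid = upto
            self.sync_calls += 1


def run_clients(n_clients, files_per_client, parent="/parent"):
    """Start n_clients threads that all create files under one parent directory."""
    nn = ToyNameNode()

    def client(cid):
        for i in range(files_per_client):
            nn.create(f"{parent}/client{cid}-file{i}")

    threads = [threading.Thread(target=client, args=(c,)) for c in range(n_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return nn
```

Calling run_clients(4, 5) and comparing nn.sync_calls with len(nn.namespace) shows how many flushes were needed for the 20 creations. With several concurrent clients the flush count is usually lower than the number of files, while a single client with a single thread needs one flush per file, which matches the sequential case mentioned in the reply.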