From: "Hider, Sandy" <Sandy.Hider@jhuapl.edu>
To: user@hadoop.apache.org
Date: Mon, 14 Oct 2013 17:49:30 -0400
Subject: Identification of mapper slots

In Hadoop, in mapred-site.xml, I can set the maximum number of mappers. For the sake of this email I will call the number of concurrent mappers "mapper slots".

Is it possible to figure out, from within the mapper, which mapper slot it is running in?

On this project this is important because each mapper has to fork off a Matlab runtime-compiled executable. The executable is passed a cache directory to work in at runtime. Setting up the cache in a new directory takes a long time, but future calls can reuse it quickly if they are given the same cache location. As it turns out, when multiple mappers try to use the same cache, they crash the executable. So ideally, if I could identify which mapper slot a mapper is running in, I could set up one cache per slot, avoid the cache-creation time, and still guarantee that no two mappers write to the same cache.
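To make the question concrete, below is a rough sketch of the kind of setup() I have in mind (new mapreduce API; the MatlabMapper class name, the /local/matlab-cache path, and the key/value types are just placeholders for this email). Keying the cache directory off the task ID keeps concurrent mappers apart, but it creates one cache per map task rather than one per slot, which is exactly the cost I am trying to avoid:

    import java.io.File;
    import java.io.IOException;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.TaskAttemptID;

    public class MatlabMapper extends Mapper<LongWritable, Text, Text, Text> {

        private File cacheDir;

        @Override
        protected void setup(Context context) throws IOException, InterruptedException {
            TaskAttemptID attempt = context.getTaskAttemptID();
            // getTaskID().getId() is the index of this map task within the job,
            // so no two concurrent mappers collide -- but it is per task, not
            // per slot, so the cache gets rebuilt for every task.
            int taskIndex = attempt.getTaskID().getId();
            cacheDir = new File("/local/matlab-cache/task-" + taskIndex);  // placeholder path
            if (!cacheDir.exists() && !cacheDir.mkdirs()) {
                throw new IOException("Could not create cache dir " + cacheDir);
            }
            // If I could get a stable slot number here instead of taskIndex,
            // I could safely reuse "/local/matlab-cache/slot-<n>" across tasks.
        }

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // fork the Matlab runtime-compiled executable here, pointing it at cacheDir
        }
    }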
Thanks for taking the time to read this,

Sandy