From: "Hong Tang (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Date: Fri, 18 Sep 2009 22:55:16 -0700 (PDT)
Subject: [jira] Updated: (MAPREDUCE-728) Mumak: Map-Reduce Simulator
Message-ID: <2028431003.1253339716128.JavaMail.jira@brutus>
In-Reply-To: <554593849.1247003174898.JavaMail.jira@brutus>

     [ https://issues.apache.org/jira/browse/MAPREDUCE-728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hong Tang updated
MAPREDUCE-728:
--------------------------------

    Attachment:     (was: mapreduce-728-20090918-6.patch)

> Mumak: Map-Reduce Simulator
> ---------------------------
>
>                 Key: MAPREDUCE-728
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-728
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>    Affects Versions: 0.21.0
>            Reporter: Arun C Murthy
>            Assignee: Hong Tang
>             Fix For: 0.21.0
>
>         Attachments: 19-jobs.topology.json.gz, 19-jobs.trace.json.gz, mapreduce-728-20090917-3.patch, mapreduce-728-20090917-4.patch, mapreduce-728-20090917.patch, mapreduce-728-20090918-2.patch, mapreduce-728-20090918-3.patch, mapreduce-728-20090918-5.patch, mapreduce-728-20090918-6.patch, mapreduce-728-20090918.patch, mumak.png
>
>
> h3. Vision:
> We want to build a simulator for large-scale Hadoop clusters, applications and workloads. It would help further Hadoop by giving researchers and developers a tool to prototype features (e.g. pluggable block placement for HDFS, Map-Reduce schedulers, etc.) and predict their behaviour and performance with a reasonable amount of confidence, thereby aiding rapid innovation.
> ----
> h3. First Cut: Simulator for the Map-Reduce Scheduler
> The Map-Reduce scheduler is a fertile area of interest, with at least four schedulers currently in existence, each with its own set of features: Default Scheduler, Capacity Scheduler, Fairshare Scheduler & Priority Scheduler.
> Each scheduler's decisions are driven by many factors, such as fairness, capacity guarantees, resource availability, data locality, etc.
> Given that, it is non-trivial to choose the right scheduler, or even the right set of scheduler features, for a given workload. Hence a simulator that can predict how well a particular scheduler works for a specific workload, by quickly iterating over schedulers and/or scheduler features, would be quite useful.
> So, the first cut is to implement a simulator for the Map-Reduce scheduler which takes as input a job trace derived from a production workload and a cluster definition, and simulates the execution of the jobs as defined in the trace in this virtual cluster. As output, the detailed job execution trace (recorded in relation to virtual simulated time) could then be analyzed to understand various traits of individual schedulers (individual job turnaround time, throughput, fairness, capacity guarantees, etc.). To support this, we would need a simulator that accurately models the conditions of the actual system which affect a scheduler's decisions. These include very large-scale clusters (thousands of nodes), the detailed characteristics of the workload thrown at the clusters, job or task failures, data locality, and cluster hardware (CPU, memory, disk I/O, network I/O, network topology).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.