Date: Mon, 4 Nov 2013 06:49:22 +0000 (UTC)
From: "Ming Chen (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Subject: [jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck

     [ https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Chen updated MAPREDUCE-5605:
---------------------------------
    Attachment:     (was: CacheOutputStream.java)

> Memory-centric MapReduce aiming to solve the I/O bottleneck
> -----------------------------------------------------------
>
>                 Key: MAPREDUCE-5605
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 1.0.1
>         Environment: x86-64 Linux/Unix
> jdk7 preferred
>            Reporter: Ming Chen
>            Assignee: Ming Chen
>         Attachments: JobTaskRunner.java, JvmManager.java, JvmTask.java, MapOutputFile.java, MapRamManager.java, MapRunner.java, MapTask.java, MapTaskCompletionEventsUpdate.java, MapTaskRunner.java, MapTaskStatus.java, MemoryElement.java, MergeSorter.java, Merger.java, Operation.java, OutputCollector.java, OutputCommitter.java, OutputFormat.java, OutputLogFilter.java, Partitioner.java, RamManager.java, RawBufferedOutputStream.java, RawHistoryFileServlet.java, RawKeyValueIterator.java, RecordReader.java, ReduceRamManager.java, ReduceTask.java, ReduceTaskRunner.java, ReduceTaskStatus.java, ReinitTrackerAction.java, RoundQueue.java, RunningJob.java, SequenceFileOutputFormat.java, SpillScheduler.java, Task.java, TaskInProgress.java, TaskLog.java, TaskLogAppender.java, TaskLogServlet.java, TaskLogsTruncater.java, TaskMemoryManagerThread.java, TaskReport.java, TaskRunner.java, TaskScheduler.java, TaskStatus.java, TaskTracker.java, TaskTrackerAction.java, TaskTrackerInstrumentation.java, TaskTrackerStatus.java, TextOutputFormat.java
>
>
> Memory is an important resource for bridging the gap between CPUs and I/O devices, so the idea is to maximize the use of memory to relieve the I/O bottleneck. We developed a multi-threaded task execution engine, which runs in a single JVM on a node. In the execution engine, we implemented a memory scheduling algorithm to provide global memory management. Building on it, we developed techniques such as sequential disk access and multi-cache, and solved the problem of full garbage collection in the JVM. We conducted extensive experiments comparing the system against the native Hadoop platform. The results show that the Mammoth system can reduce job execution time by more than 40% in typical cases, without requiring any modification of Hadoop programs.
When a system is short of memory, Mammoth can improve the performance by up to 4 times, as observed for I/O intensive applications, such as PageRank.

--
This message was sent by Atlassian JIRA
(v6.1#6144)