hadoop-mapreduce-dev mailing list archives

From "Ming Chen (JIRA)" <j...@apache.org>
Subject [jira] [Created] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck
Date Sat, 02 Nov 2013 14:19:17 GMT
Ming Chen created MAPREDUCE-5605:

             Summary: Memory-centric MapReduce aiming to solve the I/O bottleneck
                 Key: MAPREDUCE-5605
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
    Affects Versions: 1.0.1
         Environment: x86-64 Linux/Unix; JDK 7 preferred
            Reporter: Ming Chen
            Assignee: Ming Chen

Memory is an important resource for bridging the gap between CPUs and I/O devices,
so the idea is to make maximum use of memory to relieve the I/O bottleneck. We
developed a multi-threaded task execution engine that runs in a single JVM on a node.
Within the engine, we implemented a memory scheduling algorithm that provides global
memory management. On top of it, we developed techniques such as sequential disk
access and multi-cache, and we solved the problem of full garbage collection in the
JVM. We conducted extensive experiments comparing the system, called Mammoth, against
the native Hadoop platform. The results show that Mammoth reduces job execution time
by more than 40% in typical cases, without requiring any modification of Hadoop
programs. When a system is short of memory, Mammoth can improve performance by up to
4 times, as observed for I/O-intensive applications such as PageRank.
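
To illustrate what global memory management across task threads in one JVM could
look like, here is a minimal sketch. It is not Mammoth's actual scheduling algorithm,
which this message does not describe; the class GlobalMemoryBudget and its methods
are hypothetical names chosen for illustration. The assumption is that every task
thread asks a shared budget before buffering data in memory, and spills to disk when
the request is refused.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch: one memory budget shared by all task threads in a JVM.
 * This is an illustration only, not the algorithm used by Mammoth.
 */
final class GlobalMemoryBudget {
    private final long capacityBytes;
    private long usedBytes = 0;
    // Bytes currently reserved by each task, keyed by an assumed task id.
    private final Map<String, Long> reservedByTask = new HashMap<>();

    GlobalMemoryBudget(long capacityBytes) {
        if (capacityBytes <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacityBytes = capacityBytes;
    }

    /**
     * Tries to reserve memory for a task. Returns false when the budget is
     * exhausted, in which case the caller would spill to disk instead.
     */
    synchronized boolean tryReserve(String taskId, long bytes) {
        if (bytes < 0) {
            throw new IllegalArgumentException("bytes must be non-negative");
        }
        if (bytes > capacityBytes - usedBytes) {
            return false;
        }
        usedBytes += bytes;
        reservedByTask.merge(taskId, bytes, Long::sum);
        return true;
    }

    /** Releases everything a task holds and returns the number of bytes freed. */
    synchronized long releaseAll(String taskId) {
        Long held = reservedByTask.remove(taskId);
        if (held == null) {
            return 0;
        }
        usedBytes -= held;
        return held;
    }

    /** Bytes still available to any task. */
    synchronized long availableBytes() {
        return capacityBytes - usedBytes;
    }
}
```

In this sketch a map task and a reduce task running as threads in the same JVM draw
from the same pool, so memory freed by one becomes available to the other at once,
which separate per-task JVMs with fixed heap sizes cannot do.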

