Date: Wed, 6 Jul 2011 11:00:16 +0000 (UTC)
From: "Devaraj K (JIRA)"
To: mapreduce-dev@hadoop.apache.org
Message-ID: <1471665278.3528.1309950016899.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Created] (MAPREDUCE-2647) Memory sharing across all the Tasks in the Task Tracker to improve the job performance

Memory sharing across all the Tasks in the Task Tracker to improve the
job performance
--------------------------------------------------------------------------------------

                 Key: MAPREDUCE-2647
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2647
             Project: Hadoop Map/Reduce
          Issue Type: New Feature
          Components: tasktracker
            Reporter: Devaraj K
            Assignee: Devaraj K


If all the tasks (maps/reduces) of a job work with the same additional data, each task currently has to load that data into memory on its own before reading it. Every task repeats the same work. Instead, the data can be loaded into main memory once and used by all the tasks.

h5. Proposed Solution:

1. Provide a mechanism to load the data into shared memory and to read that data from main memory.
2. Provide a Java API that internally uses a native implementation to read the data from memory. All the maps/reducers can use this API to read the data from main memory.

h5. Example:

Suppose that in a map task the IP address is the key, and the task needs to look up the location of the IP address in a local file. In this case each map task has to open the file, load it into main memory, read from it, and close it, and this cost is paid again by every task. Instead, the file can be loaded into the Task Tracker's memory once, and each task can read from that memory directly.

-- 
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
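
The load-once idea in the example above can be illustrated with a minimal sketch. It is not the proposed native shared-memory implementation: it only simulates tasks as threads inside one JVM, where a cache hands every task the same parsed table. The class name SharedLookupSketch, the methods load and shared, the LOADS counter, and the "ip,location" file layout are all assumptions made for illustration and are not part of any Hadoop API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class SharedLookupSketch {
    // Hypothetical cache: one parsed table per file, shared by every task in this JVM.
    private static final Map<Path, Map<String, String>> CACHE = new ConcurrentHashMap<>();
    // Counts how many times a file was actually read and parsed.
    static final AtomicInteger LOADS = new AtomicInteger();

    // Parses lines of the assumed form "ip,location" into a read-only map.
    static Map<String, String> load(Path file) {
        LOADS.incrementAndGet();
        Map<String, String> table = new HashMap<>();
        try {
            for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
                String[] parts = line.split(",", 2);
                if (parts.length == 2) {
                    table.put(parts[0].trim(), parts[1].trim());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Collections.unmodifiableMap(table);
    }

    // Returns the shared table, loading it only on the first request for this file.
    static Map<String, String> shared(Path file) {
        return CACHE.computeIfAbsent(file.toAbsolutePath().normalize(), SharedLookupSketch::load);
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("ip-locations", ".csv");
        Files.write(file, "10.0.0.1,Bangalore\n10.0.0.2,Chennai\n".getBytes(StandardCharsets.UTF_8));

        // Simulate four map tasks that each need the lookup table.
        Thread[] tasks = new Thread[4];
        for (int i = 0; i < tasks.length; i++) {
            tasks[i] = new Thread(() -> shared(file).get("10.0.0.1"));
            tasks[i].start();
        }
        for (Thread t : tasks) {
            t.join();
        }
        System.out.println("lookup: " + shared(file).get("10.0.0.2"));
        System.out.println("file loads: " + LOADS.get());
        Files.delete(file);
    }
}
```

In this sketch the file is parsed once no matter how many simulated tasks ask for it. Sharing across separate task JVMs, as the issue proposes, would need an OS-level mechanism such as shared memory or a memory-mapped file, which this example does not attempt.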