From: "Amar Kamat (JIRA)"
To: mapreduce-dev@hadoop.apache.org
Date: Tue, 28 Jul 2009 05:13:14 -0700 (PDT)
Message-ID: <1262453582.1248783194952.JavaMail.jira@brutus>
Subject: [jira] Created: (MAPREDUCE-812) Total number of splits/maps can be encoded as the first field while serializing splits

Total number of splits/maps can be encoded
as the first field while serializing splits
--------------------------------------------------------------------------------------

                 Key: MAPREDUCE-812
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-812
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
          Components: jobtracker
            Reporter: Amar Kamat


To find out the total number of maps, the whole split file is deserialized and only then is the check made (num-maps = length of the split array). The problem is that when the total number of splits is large, all the splits are loaded unnecessarily and then discarded. Instead, we can encode the total number of splits as the first field of the split file, so the count can be read without deserializing any split.
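
A minimal sketch of the idea, not the actual Hadoop split-file format: the writer emits the split count as a leading int, and a reader that only needs num-maps stops after that one field. The class name SplitCountHeader, its two methods, and the use of plain strings as stand-ins for serialized splits are all assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;

// Hypothetical helper, not part of Hadoop: illustrates a count-prefixed layout.
final class SplitCountHeader {

    // Serialize the splits with their total count written as the first field.
    static byte[] write(List<String> splits) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(splits.size());   // first field: total number of splits
            for (String split : splits) {
                out.writeUTF(split);       // stand-in for a real serialized split
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Read only the leading count; no split record is deserialized.
    static int readSplitCount(byte[] splitFile) {
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(splitFile))) {
            return in.readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this layout the cost of finding num-maps is a single 4-byte read, independent of how many splits follow. The split array itself would still be deserialized later, when the tasks are actually created.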