Return-Path: Delivered-To: apmail-lucene-hadoop-dev-archive@locus.apache.org Received: (qmail 38907 invoked from network); 14 Feb 2006 22:55:34 -0000 Received: from hermes.apache.org (HELO mail.apache.org) (209.237.227.199) by minotaur.apache.org with SMTP; 14 Feb 2006 22:55:34 -0000 Received: (qmail 46434 invoked by uid 500); 14 Feb 2006 22:55:32 -0000 Delivered-To: apmail-lucene-hadoop-dev-archive@lucene.apache.org Received: (qmail 46386 invoked by uid 500); 14 Feb 2006 22:55:32 -0000 Mailing-List: contact hadoop-dev-help@lucene.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: hadoop-dev@lucene.apache.org Delivered-To: mailing list hadoop-dev@lucene.apache.org Received: (qmail 46355 invoked by uid 99); 14 Feb 2006 22:55:32 -0000 X-ASF-Spam-Status: No, hits=0.0 required=10.0 tests= X-Spam-Check-By: apache.org Received: from [192.87.106.226] (HELO ajax.apache.org) (192.87.106.226) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 14 Feb 2006 14:55:30 -0800 Received: from ajax.apache.org (ajax.apache.org [127.0.0.1]) by ajax.apache.org (Postfix) with ESMTP id 11DE1DC for ; Tue, 14 Feb 2006 23:55:09 +0100 (CET) Message-ID: <1296449919.1139957709049.JavaMail.jira@ajax.apache.org> Date: Tue, 14 Feb 2006 23:55:09 +0100 (CET) From: "Doug Cutting (JIRA)" To: hadoop-dev@lucene.apache.org Subject: [jira] Commented: (HADOOP-38) default splitter should incorporate fs block size In-Reply-To: <1089400937.1139945203207.JavaMail.jira@ajax.apache.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Virus-Checked: Checked by ClamAV on apache.org X-Spam-Rating: minotaur.apache.org 1.6.2 0/1000/N [ http://issues.apache.org/jira/browse/HADOOP-38?page=comments#action_12366401 ] Doug Cutting commented on HADOOP-38: ------------------------------------ I'm not sure what RAM size or network speeds have to do with it: we stream blocks into a task, we don't read them all at once. Restartability could be an issue. If you have a petabyte of input, and you want restartability at 100M chunks, then that means you need to be able to support up to 10M tasks per job. This is possible, but means the job tracker has to be even more careful not to store too much in RAM per task, nor iterate over all tasks, etc. But I'm not convinced that 1GB would cause problems for restartability. A petabyte input on 10k nodes (my current horizon), 1GB blocks gives 1M tasks, or 100 per node. So each task will average around 1% of the execution time, so even those that are restarted near the end of job completion won't add much to the overall time. > default splitter should incorporate fs block size > ------------------------------------------------- > > Key: HADOOP-38 > URL: http://issues.apache.org/jira/browse/HADOOP-38 > Project: Hadoop > Type: Improvement > Components: mapred > Reporter: Doug Cutting > > By default, the file splitting code should operate as follows. > inputs are *, numMapTasks, minSplitSize, fsBlockSize > output is * > totalSize = sum of all file sizes; > desiredSplitSize = totalSize / numMapTasks; > if (desiredSplitSize > fsBlockSize) /* new */ > desiredSplitSize = fsBlockSize; > if (desiredSplitSize < minSplitSize) > desiredSplitSize = minSplitSize; > chop input files into desiredSplitSize chunks & return them > In other words, the numMapTasks is a desired minimum. We'll try to chop input into at least numMapTasks chunks, each ideally a single fs block. > If there's not enough input data to create numMapTasks tasks, each with an entire block, then we'll permit tasks whose input is smaller than a filesystem block, down to a minimum split size. > This handles cases where: > - each input record takes a lot of time to process. In this case we want to make sure we use all of the cluster. Thus it is important to permit splits smaller than the fs block size. > - input i/o dominates. In this case we want to permit the placement of tasks on hosts where their data is local. This is only possible if splits are fs block size or smaller. > Are there other common cases that this algorithm does not handle well? > The part marked 'new' above is not currently implemented, but I'd like to add it. > Does this sound reasonble? -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira