From: maha
Subject: Deprecated ... damaged?
Date: Wed, 15 Dec 2010 02:13:59 -0800
To: common-user

Hi everyone,

Using Hadoop-0.20.2, I'm trying to use MultiFileInputFormat, which I expected to put each file from the input directory in a separate split, so that the number of maps equals the number of input files. Instead, each split contains the paths of multiple input files, so the number of maps is smaller than the number of input files. Is this because MultiFileInputFormat is deprecated?

My subclass, myMultiFileInputFormat, implements only the following:

    public RecordReader getRecordReader(InputSplit split, JobConf job, Reporter reporter) {
        return new myRecordReader((MultiFileSplit) split);
    }

Yet in myRecordReader, a single split contains, for example:

    /tmp/input/file1:0+300
    /tmp/input/file2:0+199

instead of each file being in its own split. Why? Any clues?

Thank you,
Maha
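
To make the expected behaviour concrete, here is a minimal sketch in plain Java of "one split per input file". It deliberately uses no Hadoop classes: `SplitSketch` and `onePerFile` are hypothetical names invented for illustration, and each "split" is modelled as just a list of paths. In Hadoop's old `mapred` API the equivalent would presumably be an override of `getSplits(JobConf, int)` that builds one `MultiFileSplit` per path; that mapping is an assumption and is not shown here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustration only: models a split as a list of file paths.
// Not Hadoop code; SplitSketch and onePerFile are hypothetical names.
class SplitSketch {

    // Returns one single-element group per input path, in input order.
    static List<List<String>> onePerFile(List<String> paths) {
        List<List<String>> splits = new ArrayList<>();
        for (String path : paths) {
            splits.add(Collections.singletonList(path));
        }
        return splits;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("/tmp/input/file1", "/tmp/input/file2");
        List<List<String>> splits = onePerFile(input);
        // Two input files give two splits, hence two map tasks.
        System.out.println(splits.size() + " splits: " + splits);
    }
}
```

With the two files from the example above, this yields two groups of one path each, which is the layout I expected to see in myRecordReader.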