Subject: Re: Quick question
From: maha
To: common-user@hadoop.apache.org
Date: Fri, 18 Feb 2011 14:07:42 -0800

Thanks Ted and Jim :)

Maha

On Feb 18, 2011, at 11:55 AM, Jim Falgout wrote:

> That's right. The TextInputFormat handles situations where records cross split boundaries. What your mapper will see is "whole" records.
>
> -----Original Message-----
> From: maha [mailto:maha@umail.ucsb.edu]
> Sent: Friday, February 18, 2011 1:14 PM
> To: common-user
> Subject: Quick question
>
> Hi all,
>
> I want to check if the following statement is right:
>
> If I use TextInputFormat to process a text file with 2000 lines (each ending with \n) with 20 mappers. Then each map will have a sequence of COMPLETE LINES.
>
> In other words, the input is not split byte-wise but by lines.
>
> Is that right?
>
> Thank you,
> Maha
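
For anyone finding this thread later, here is a rough sketch of the behaviour Jim describes. It is a plain Python simulation, not Hadoop code, and the names `make_splits` and `read_split` are made up for illustration. It assumes the usual convention for line-oriented readers: splits are byte ranges, a reader whose split does not start at byte 0 throws away everything up to and including the first newline, and every reader keeps going past the end of its split until it finishes the line it is on. Under that assumption each mapper sees only whole lines, even though the splits themselves fall at arbitrary byte offsets.

```python
def make_splits(total: int, n: int) -> list[tuple[int, int]]:
    """Cut the byte range [0, total) into roughly n equal (start, end) ranges."""
    size = -(-total // n)  # ceiling division
    return [(s, min(s + size, total)) for s in range(0, total, size)]


def read_split(data: bytes, start: int, end: int) -> list[bytes]:
    """Return the complete lines that the reader for [start, end) would emit.

    Hypothetical helper modelling the assumed rule, not a Hadoop API.
    """
    pos = start
    if start != 0:
        # Discard through the first newline at or after `start`:
        # the reader of the previous split emits that line.
        nl = data.find(b"\n", start)
        pos = len(data) if nl == -1 else nl + 1

    lines = []
    # Emit every line that *starts* at or before `end`, reading past
    # the split boundary if needed to reach the line's newline.
    while pos <= end and pos < len(data):
        nl = data.find(b"\n", pos)
        stop = len(data) if nl == -1 else nl + 1
        lines.append(data[pos:stop])
        pos = stop
    return lines


# 2000 newline-terminated lines, cut into 20 byte ranges as in the question.
data = b"".join(b"line %d\n" % i for i in range(2000))
per_split = [read_split(data, s, e) for s, e in make_splits(len(data), 20)]

# Every emitted record is a whole line, and no line is lost or duplicated.
assert all(line.endswith(b"\n") for lines in per_split for line in lines)
assert [l for lines in per_split for l in lines] == data.splitlines(keepends=True)
```

So the split boundaries are still byte offsets in this model; it is the reader that realigns them to line boundaries, which is why the mapper only ever sees whole records.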