From:
Amareshwari Sri Ramadasu
To: "general@hadoop.apache.org"
Date: Tue, 22 Dec 2009 10:01:52 +0530
Subject: Re: InputFormat related question...

Hi,

If you want a map task to process two lines at a time, you need to write a RecordReader that builds each record from two lines. LineRecordReader treats each line as one record.

You can extend NLineInputFormat to generate the splits, and return your new RecordReader to read the records from each split.

Hope this helps you.

Thanks
Amareshwari

On 12/21/09 11:12 PM, "Something Something" wrote:

In my application I have a file in this format:

The first line of the file contains the data to be processed, and *each* of the remaining lines contains parameters that will be used to slice & dice the data in various ways. In other words, each mapper needs two lines - the 1st line from this file that contains data and another line that contains parameters.

I looked at NLineInputFormat which can be used for "parameter sweeps", but it's not quite what I want. I believe this format returns N no. of consecutive lines to the mapper, correct?

What's the best way to handle this case? Do I have to write a special InputFormat class? Please help. Thanks.
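To make the "two lines per record" suggestion in the reply concrete, here is a minimal sketch of the record-building logic only. It is plain Java and does not use the Hadoop RecordReader API; the class name TwoLineRecords, the methods nextRecord and readAll, and the tab separator are all hypothetical choices made for illustration. In a real job, logic along these lines would presumably live inside a custom RecordReader returned by a class extending NLineInputFormat, as the reply describes.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of the idea; this is NOT the Hadoop RecordReader API.
// All names here are hypothetical and chosen only for illustration.
class TwoLineRecords {

    // Builds the next record from two consecutive lines, joined by a tab.
    // Returns null at end of input; a trailing unpaired line is returned alone.
    static String nextRecord(BufferedReader in) throws IOException {
        String first = in.readLine();
        if (first == null) {
            return null;
        }
        String second = in.readLine();
        return (second == null) ? first : first + "\t" + second;
    }

    // Collects every two-line record found in the given text.
    static List<String> readAll(String text) {
        List<String> records = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(new StringReader(text))) {
            String record;
            while ((record = nextRecord(in)) != null) {
                records.add(record);
            }
        } catch (IOException e) {
            // Reading from an in-memory string should not fail; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        return records;
    }

    public static void main(String[] args) {
        // Each printed record holds two consecutive input lines.
        for (String record : readAll("line1\nline2\nline3\nline4\n")) {
            System.out.println(record);
        }
    }
}
```

Note that this pairs consecutive lines, which is what the reply literally describes. The case in the original question, where the first line must accompany every parameter line, would need the reader to remember the first line and attach it to each later line instead.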