Subject: Re: Fileformat query
From: Edward Capriolo
To: common-user@hadoop.apache.org
Date: Thu, 28 Jan 2010 10:35:53 -0500

On Thu, Jan 28, 2010 at 4:01 AM, Udaya Lakshmi wrote:
> Hi all..
>   I have searched the documentation but could not find a input file
> format which will give line number as the key and line as the value.
> Did I miss something? Can someone give me a clue of how to implement
> one such input file format.
>
> Thanks,
> Udaya.
>

Udaya,

When using the standard file input format (TextInputFormat), the map method looks like this:

    public void map(LongWritable key, Text value,
        OutputCollector output, Reporter reporter) throws IOException {

Here key is the byte offset of the line within the input file, not a line number, and value is the line itself. There is no easy way to translate the byte offset into a logical line number, because each map task reads only its own split and cannot know how many lines precede it. The exception is when all lines are fixed width (not usually the case), where the line number is simply the offset divided by the line length.

Edward
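To make the offset-versus-line-number point concrete, here is a minimal sketch in plain Java. It does not use the Hadoop API; the class and method names (LineOffsets, lineStartOffsets, lineNumberForFixedWidth) are hypothetical and exist only for illustration. It assumes lines end with '\n' and that line numbers are zero-based.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: illustrates what a byte-offset key is.
final class LineOffsets {

    // Returns the byte offset at which each '\n'-terminated line starts.
    // These are the kind of values a byte-offset key carries.
    static List<Long> lineStartOffsets(byte[] data) {
        List<Long> offsets = new ArrayList<>();
        long pos = 0;
        boolean atLineStart = true;
        for (byte b : data) {
            if (atLineStart) {
                offsets.add(pos);
                atLineStart = false;
            }
            if (b == '\n') {
                atLineStart = true;
            }
            pos++;
        }
        return offsets;
    }

    // Only valid when every line occupies exactly recordLength bytes,
    // newline included. Returns a zero-based line number.
    static long lineNumberForFixedWidth(long byteOffset, int recordLength) {
        if (recordLength <= 0) {
            throw new IllegalArgumentException("recordLength must be positive");
        }
        return byteOffset / recordLength;
    }

    public static void main(String[] args) {
        // Variable-width lines: the offsets alone do not reveal the line number.
        byte[] variable = "ab\ncdef\ng\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(lineStartOffsets(variable)); // prints [0, 3, 8]

        // Fixed-width lines of 5 bytes each ("aaaa\n"): offset 10 is line 2.
        System.out.println(lineNumberForFixedWidth(10L, 5)); // prints 2
    }
}
```

With variable-width lines, a mapper that sees only the key 8 has no way of knowing that two lines came before it without reading the earlier bytes, which is why the division trick works only for fixed-width records.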