From: "Antonio D'Ettole" <codazzo@gmail.com>
To: common-user@hadoop.apache.org
Date: Tue, 2 Feb 2010 09:56:47 +0100
Subject: Re: Using custom Input Format

Rakhi,

I've recently had to implement a custom InputFormat that's pretty basic (every split is basically a list of integers). You can check it out here:
http://github.com/codazzo/MultiRow

One guy also implemented a custom InputFormat and wrote about it on his blog:
http://codedemigod.com/blog/?p=120

Hope that helps.
Antonio

On Fri, Jan 29, 2010 at 2:04 PM, Rakhi Khatwani wrote:
> Hi,
> I have been trying to implement custom input and output formats. I was
> successful in creating a custom output format, but when I call a
> map/reduce job that takes in a file using the custom input format, I get
> an exception.
> java.lang.NullPointerException
>     at Beans.Content.write(Content.java:54)
>     at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:90)
>     at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:77)
>     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:613)
>     at CustomInputFormat.SampleCustomInputMap.map(SampleCustomInputMap.java:32)
>     at CustomInputFormat.SampleCustomInputMap.map(SampleCustomInputMap.java:1)
>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:138)
>
> 0 Hello World0 Hello World0 Hello World0
>
> id:: null media:: null url:: null content:: null
> id:: null media:: null url:: null content:: null
>
> java.io.IOException: Job failed!
>     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1232)
>     at CustomInputFormat.SampleCustomInputMapReduce.run(SampleCustomInputMapReduce.java:53)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at CustomInputFormat.SampleCustomInputMapReduce.main(SampleCustomInputMapReduce.java:59)
>
> I have attached the following files:
> Content => custom object that implements WritableComparable
> ContentInputFormatV2 => input format that extends SequenceFileInputFormat
> ContentRecordReader => implementation of RecordReader (not strictly
> required, though; I assume it should work without it)
> SampleCustomInputMap => mapper class
> SampleCustomInputReduce => reducer class
> SampleCustomInputMapReduce => class that contains the main method and the
> job configuration
> data and index => my input files for the main method
>
> For the first record it works fine, but for the next record and the
> records after that I get null. Where could I have gone wrong?
>
> Regards,
> Raakhi
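For the archive: a NullPointerException thrown inside `Content.write()` during `WritableSerialization` usually means `write()` is serializing a field that is still null, and null fields on every record *after* the first are the classic symptom of Hadoop reusing a single Writable instance between records while `readFields()` fails to overwrite every field. Below is a sketch of a null-safe `write()`/`readFields()` pair. The field names (`id`, `media`, `url`, `content`) are guessed from the log output above, and plain `java.io` streams stand in for Hadoop's `org.apache.hadoop.io.WritableComparable` so the example is self-contained — real code would implement that interface instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical reconstruction of a Content-like record. In a real job this
// class would implement org.apache.hadoop.io.WritableComparable<Content>.
public class Content {
    private String id;       // field names are assumptions based on the log output
    private String media;
    private String url;
    private String content;

    public Content() {}      // Hadoop requires a no-arg constructor for Writables

    public Content(String id, String media, String url, String content) {
        this.id = id; this.media = media; this.url = url; this.content = content;
    }

    // Write a presence flag before each field so a null value can never
    // NPE here -- the usual cause of the exception at Content.write().
    public void write(DataOutput out) throws IOException {
        writeNullableString(out, id);
        writeNullableString(out, media);
        writeNullableString(out, url);
        writeNullableString(out, content);
    }

    // readFields must overwrite EVERY field: Hadoop reuses Writable instances
    // between records, so any field left unread keeps the previous record's
    // (possibly stale or null) value.
    public void readFields(DataInput in) throws IOException {
        id = readNullableString(in);
        media = readNullableString(in);
        url = readNullableString(in);
        content = readNullableString(in);
    }

    private static void writeNullableString(DataOutput out, String s) throws IOException {
        out.writeBoolean(s != null);
        if (s != null) out.writeUTF(s);
    }

    private static String readNullableString(DataInput in) throws IOException {
        return in.readBoolean() ? in.readUTF() : null;
    }

    public String getId()      { return id; }
    public String getMedia()   { return media; }
    public String getContent() { return content; }

    public static void main(String[] args) throws IOException {
        Content original = new Content("42", null, "http://example.com", "Hello World");
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        original.write(new DataOutputStream(buf));   // the null media field is handled

        // Simulate Hadoop reusing one instance: start from stale values and
        // check that readFields replaces all of them.
        Content reused = new Content("stale", "stale", "stale", "stale");
        reused.readFields(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));
        System.out.println(reused.getId() + " " + reused.getContent());
    }
}
```

Another thing worth checking in the mapper: if it holds references to the key/value objects handed to `map()` across calls (e.g. collecting them in a list), the reuse of those instances by the framework will also make later records appear to overwrite earlier ones.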