From: Bertrand Dechoux
To: user@hadoop.apache.org
Date: Fri, 30 Nov 2012 14:06:19 +0100
Subject: Re: Mapper outputs an empty file

You should write unit tests (MRUnit), and debug if that is not enough.
I would assume that you are reading your file line by line. Each line is not a valid XML document, so an exception is thrown and then caught, but without any logs or counters.

Regards

Bertrand

On Fri, Nov 30, 2012 at 1:21 PM, dyuti a <hadoop.hive04@gmail.com> wrote:

> Hi All,
> I am trying XML processing in Hadoop and used the code below inside the map
> method. It results in an empty file (no reducer class is used). Is there
> anything wrong?
>
> //code used inside map method
> public void map(LongWritable key, Text value1, Context context)
>         throws IOException, InterruptedException {
>     String xmlString = value1.toString();
>     SAXBuilder builder = new SAXBuilder();
>     Reader in = new StringReader(xmlString);
>     String value = "";
>     try {
>         Document doc = builder.build(in);
>         Element rootNode = doc.getRootElement();
>         List<Element> list = rootNode.getChildren("staff");
>         for (int i = 0; i < list.size(); i++) {
>             Element node = (Element) list.get(i);
>             String tag1 = node.getChildText("firstname");
>             String tag2 = node.getChildText("lastname");
>             String tag3 = node.getChildText("nickname");
>             String tag4 = node.getChildText("salary");
>
>             value = tag1 + "," + tag2 + "," + tag3 + "," + tag4;
>             context.write(NullWritable.get(), new Text(value));
>         }
>     } followed by catch statements....................
>
> //xml input file
> <?xml version="1.0" encoding="UTF-8"?>
> <company>
>     <staff>
>         <firstname>yong</firstname>
>         <lastname>mook kim</lastname>
>         <nickname>mkyong</nickname>
>         <salary>100000</salary>
>     </staff>
>     <staff>
>         <firstname>low121</firstname>
>         <lastname>yin fong1</lastname>
>         <nickname>fong fong1</nickname>
>         <salary>2000001</salary>
>     </staff>
> </company>
>
> Thanks for your help!
>
> Regards,
> dti

--
Bertrand Dechoux
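
The line-by-line assumption in the reply can be checked outside Hadoop with a small sketch like the one below. It rests on assumptions that the thread does not confirm: that the input file has one element per line, and that the job uses a line-oriented input format, so each map call receives a single line. To stay self-contained it uses the JDK's built-in DocumentBuilder rather than JDOM's SAXBuilder, and the names LineByLineXmlDemo, countStaff and perLine are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class LineByLineXmlDemo {

    // The sample input from the question, assumed to be laid out one element per line.
    static final String XML =
          "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<company>\n"
        + "<staff>\n"
        + "<firstname>yong</firstname>\n"
        + "<lastname>mook kim</lastname>\n"
        + "<nickname>mkyong</nickname>\n"
        + "<salary>100000</salary>\n"
        + "</staff>\n"
        + "<staff>\n"
        + "<firstname>low121</firstname>\n"
        + "<lastname>yin fong1</lastname>\n"
        + "<nickname>fong fong1</nickname>\n"
        + "<salary>2000001</salary>\n"
        + "</staff>\n"
        + "</company>";

    // Returns the number of direct <staff> children of the root element,
    // or -1 if the text cannot be parsed as an XML document.
    static int countStaff(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList children = doc.getDocumentElement().getChildNodes();
            int staff = 0;
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child instanceof Element && "staff".equals(child.getNodeName())) {
                    staff++;
                }
            }
            return staff;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            return -1;
        }
    }

    // Feeds each line to countStaff separately, as a line-oriented mapper would.
    // Returns {lines that failed to parse, lines parsed with no staff, staff records found}.
    static int[] perLine(String xml) {
        int failed = 0, empty = 0, records = 0;
        for (String line : xml.split("\n")) {
            int n = countStaff(line);
            if (n < 0) {
                failed++;
            } else if (n == 0) {
                empty++;
            } else {
                records += n;
            }
        }
        return new int[] {failed, empty, records};
    }

    public static void main(String[] args) {
        System.out.println("whole document, staff records: " + countStaff(XML));
        int[] r = perLine(XML);
        System.out.println("per line, parse failures: " + r[0]
            + ", parsed but no staff: " + r[1]
            + ", staff records: " + r[2]);
    }
}
```

Under these assumptions the whole document yields both staff records, while no single line yields any: a line either fails to parse or parses as a fragment with no staff children. In the mapper, a log statement or a counter increment in the catch block would make the failing case visible in the job output.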