hadoop-mapreduce-user mailing list archives

From Ranjini Rathinam <ranjinibe...@gmail.com>
Subject Re: Hadoop-MapReduce
Date Wed, 11 Dec 2013 07:00:40 GMT
Hi,

I have fixed the error and the code now runs fine, but it only splits out
the part of the document inside the tag.

I want to convert the XML into text format so that I can load it into
HBase and Hive tables.

I have tried the DOM parser, but the way I use it the parser takes a File
object, whereas HDFS files are accessed through FileSystem.

For example:

File fXmlFile = new File("D:/elango/test.xml");

System.out.println(g);  // note: g is not defined anywhere in this snippet
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(fXmlFile);


This can't be used with HDFS, because an HDFS path is accessed through
FileSystem.

Could you please suggest how to fix this issue?

Thanks in advance,

Ranjini R
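
One possible way around the File limitation, offered as a sketch rather
than a confirmed fix: DocumentBuilder.parse also accepts an InputStream,
and Hadoop's FileSystem.open(Path) returns an FSDataInputStream, which is
an InputStream. The example below is plain Java so it runs without a
cluster; the class name StreamDomExample and the sample XML are made up
for illustration, and a ByteArrayInputStream stands in for the HDFS stream.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical example class: parses XML from a stream instead of a File.
class StreamDomExample {

    // Builds a DOM Document from any InputStream; no java.io.File involved.
    static Document parse(InputStream in) throws Exception {
        DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
        return dBuilder.parse(in);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<person><fname>x</fname><lname>y</lname></person>";

        // Stand-in for the HDFS stream. On a cluster the stream would
        // instead come from something like (assumed, not executed here):
        //   FileSystem fs = FileSystem.get(new Configuration());
        //   InputStream in = fs.open(new Path("/some/hdfs/path.xml"));
        InputStream in = new ByteArrayInputStream(xml.getBytes("UTF-8"));
        try {
            Document doc = parse(in);
            // Prints the text content of the first <fname> element: x
            System.out.println(
                doc.getElementsByTagName("fname").item(0).getTextContent());
        } finally {
            in.close();
        }
    }
}
```

Note that loading a whole document into a DOM tree only suits files that
fit in memory; for large inputs a per-record approach is the safer choice.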




On Tue, Dec 10, 2013 at 11:07 AM, Ranjini Rathinam
<ranjinibecse@gmail.com> wrote:

>
>
>  ---------- Forwarded message ----------
> From: Shekhar Sharma <shekhar2581@gmail.com>
> Date: Mon, Dec 9, 2013 at 10:23 PM
> Subject: Re: Hadoop-MapReduce
> To: user@hadoop.apache.org
>  Cc: ssanyal@datameer.com
>
>
> It does work; I used it a long time ago.
>
> BTW, if it is not working, write a custom input format and implement
> your own record reader. That would be far easier than breaking your
> head over someone else's code.
>
> Break your problem into steps:
>
> (1) First, the XML data is multiline, meaning multiple lines make up a
> single record. A record for you might be
>
> <person>
>   <fname>x</fname>
>   <lname>y</lname>
> </person>
>
> (2) Implement a record reader that looks for the starting and
> ending person tags (check how RecordReader.java is written).
>
> (3) Once you have the contents between the starting and ending tags, you
> can use an XML parser to parse the contents into a Java object and
> form your own key-value pairs (custom key and custom value).
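
The scanning part of step (2) can be sketched in plain Java as below. This
is only an illustration under stated assumptions: the class name
TagRecordScanner is made up, the tags are matched as literal bytes (so a
start tag with attributes would not match), and the Hadoop-specific parts
(implementing org.apache.hadoop.mapred.RecordReader with type parameters
and respecting input split boundaries) are deliberately left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper: pulls out each <start>...</end> record from a stream.
class TagRecordScanner {
    private final InputStream in;
    private final byte[] startTag;
    private final byte[] endTag;

    TagRecordScanner(InputStream in, String start, String end) throws IOException {
        this.in = in;
        this.startTag = start.getBytes("UTF-8");
        this.endTag = end.getBytes("UTF-8");
    }

    // Returns the next record including both tags, or null when no
    // further complete record exists in the input.
    String next() throws IOException {
        if (!readUntilMatch(startTag, null)) {
            return null;
        }
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        buf.write(startTag);
        if (!readUntilMatch(endTag, buf)) {
            return null;
        }
        return buf.toString("UTF-8");
    }

    // Consumes bytes until `match` has been seen, copying them to `out`
    // when it is non-null. The simple restart rule below is only correct
    // for patterns whose first byte does not recur inside them, which
    // holds for literal tags such as <person> and </person>.
    private boolean readUntilMatch(byte[] match, ByteArrayOutputStream out)
            throws IOException {
        int i = 0;
        int b;
        while ((b = in.read()) != -1) {
            if (out != null) {
                out.write(b);
            }
            if (b == match[i]) {
                i++;
                if (i == match.length) {
                    return true;
                }
            } else {
                i = (b == match[0]) ? 1 : 0;
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        String xml = "<people>\n<person>\n  <fname>x</fname>\n  <lname>y</lname>\n"
                   + "</person>\n<person><fname>a</fname></person>\n</people>";
        TagRecordScanner scanner = new TagRecordScanner(
            new ByteArrayInputStream(xml.getBytes("UTF-8")), "<person>", "</person>");
        String record;
        while ((record = scanner.next()) != null) {
            System.out.println(record);  // one complete <person> record per call
        }
    }
}
```

For step (3), each returned record string is a small well-formed XML
fragment, so it can be handed to any XML parser to build the key and value.
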
>
>
> Hope you have enough pointers to write the code.
>
>
> Regards,
> Som Shekhar Sharma
> +91-8197243810
>
>
>  On Mon, Dec 9, 2013 at 6:30 PM, Ranjini Rathinam <ranjinibecse@gmail.com>
> wrote:
> > Hi Subroto Sanyal,
> >
> > The link you provided about XML does not work for me. The XmlContent
> > class written there is not allowed in XmlInputFormat.
> >
> > Could you please help? If someone has already coded this scenario, I
> > need the working code.
> >
> > I have also tried the SAX parser, but even though the jars are added to
> > the classpath, the error that comes up is a NoClassFound exception.
> >
> > Please provide sample code for the same.
> >
> > Thanks in advance,
> > Ranjini.R
> >
> > On Mon, Dec 9, 2013 at 12:34 PM, Ranjini Rathinam <ranjinibecse@gmail.com>
> > wrote:
> >>
> >>
> >>>> Hi,
> >>>>
> >>>> As suggested by the link below, I have used that code in my program,
> >>>>
> >>>> but I am facing the issues below. Please help me fix these errors.
> >>>>
> >>>>
> >>>> XmlReader.java:8: XmlReader.Map is not abstract and does not override
> >>>> abstract method
> >>>> map(org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>,org.apache.hadoop.mapred.Reporter)
> >>>> in org.apache.hadoop.mapred.Mapper
> >>>>  public static class Map extends MapReduceBase implements Mapper
> >>>> <LongWritable, Text, Text, Text> {
> >>>>                ^
> >>>> ./XmlInputFormat.java:16: XmlInputFormat.XmlRecordReader is not abstract
> >>>> and does not override abstract method
> >>>> next(java.lang.Object,java.lang.Object) in
> >>>> org.apache.hadoop.mapred.RecordReader
> >>>> public class XmlRecordReader implements RecordReader {
> >>>>        ^
> >>>> Note: XmlReader.java uses unchecked or unsafe operations.
> >>>> Note: Recompile with -Xlint:unchecked for details.
> >>>> 2 errors
> >>>>
> >>>>
> >>>> I am using Hadoop 0.20 and Java 1.6.
> >>>>
> >>>> Please suggest a fix.
> >>>>
> >>>> Thanks in advance.
> >>>>
> >>>> Regards,
> >>>> Ranjini. R
> >>>> On Mon, Dec 9, 2013 at 11:08 AM, Ranjini Rathinam
> >>>> <ranjinibecse@gmail.com> wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>> ---------- Forwarded message ----------
> >>>>> From: Subroto <ssanyal@datameer.com>
> >>>>> Date: Fri, Dec 6, 2013 at 4:42 PM
> >>>>> Subject: Re: Hadoop-MapReduce
> >>>>> To: user@hadoop.apache.org
> >>>>>
> >>>>>
> >>>>> Hi Ranjini,
> >>>>>
> >>>>> A good example to look into:
> >>>>> http://www.undercloud.org/?p=408
> >>>>>
> >>>>> Cheers,
> >>>>> Subroto Sanyal
> >>>>>
> >>>>> On Dec 6, 2013, at 12:02 PM, Ranjini Rathinam wrote:
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> How can I read an XML file via MapReduce and load it into HBase and
> >>>>> Hive using Java?
> >>>>>
> >>>>> Please provide sample code.
> >>>>>
> >>>>> I am using Hadoop 0.20 and Java 1.6. Which parser version
> >>>>> should be used?
> >>>>>
> >>>>> Thanks in advance.
> >>>>>
> >>>>> Ranjini
> >>>>>
> >>>>>
> >>>>>
> >>>>
> >>>
> >>
> >
>
>
