hadoop-hdfs-user mailing list archives

From Shekhar Sharma <shekhar2...@gmail.com>
Subject Re: Hadoop-MapReduce
Date Mon, 09 Dec 2013 16:53:49 GMT
It does work; I used it a long time ago.

By the way, if it is not working for you, write a custom input format
and implement your own record reader. That would be far easier than
breaking your head over someone else's code.

Break your problem into steps:

(1) First, the XML data is multiline, meaning that multiple lines make
up a single record. A record for you might be:

<person>
 <fname>x</fname>
  <lname>y</lname>
</person>

(2) Implement a record reader that looks for the starting and ending
person tags (check how RecordReader.java is written).

(3) Once you have the contents between the starting and ending tags,
you can use an XML parser to parse them into a Java object and form
your own key-value pairs (custom key and custom value).
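
As a rough sketch of steps (2) and (3), here is the tag-scanning logic in
plain Java with no Hadoop dependency, so it can be tried on its own. The
class name PersonRecordScanner, the method names, and the choice of lname
as key and fname as value are assumptions made for illustration, not part
of any Hadoop API. In a real job this loop would sit inside your custom
RecordReader and would also have to respect the input split boundaries,
which the sketch ignores.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper: shows the scanning logic only, without Hadoop classes.
public class PersonRecordScanner {

    // Reads bytes until `match` has been seen. If `sink` is non-null, every
    // byte read (including the match itself) is copied into it.
    // Assumes the first byte of `match` does not occur again inside `match`,
    // which holds for tags such as "<person>" and "</person>".
    private static boolean readUntilMatch(InputStream in, byte[] match,
                                          ByteArrayOutputStream sink) throws IOException {
        int i = 0;
        while (true) {
            int b = in.read();
            if (b == -1) {
                return false;               // end of input, no match found
            }
            if (sink != null) {
                sink.write(b);
            }
            if (b == (match[i] & 0xff)) {
                i++;
                if (i == match.length) {
                    return true;
                }
            } else {
                // Restart, but let the current byte begin a new match attempt.
                i = (b == (match[0] & 0xff)) ? 1 : 0;
            }
        }
    }

    // Step (2): collect every startTag...endTag block as one record.
    public static List<String> readRecords(InputStream in, String startTag,
                                           String endTag) throws IOException {
        byte[] start = startTag.getBytes("UTF-8");
        byte[] end = endTag.getBytes("UTF-8");
        List<String> records = new ArrayList<String>();
        while (readUntilMatch(in, start, null)) {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            buf.write(start);
            if (!readUntilMatch(in, end, buf)) {
                break;                      // truncated record: drop it
            }
            records.add(buf.toString("UTF-8"));
        }
        return records;
    }

    // Step (3): parse one record and build a {key, value} pair.
    // Assumes each record has an <fname> and an <lname> child; a missing
    // element would cause a NullPointerException here.
    public static String[] toKeyValue(String recordXml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new ByteArrayInputStream(recordXml.getBytes("UTF-8")));
        String fname = doc.getElementsByTagName("fname").item(0).getTextContent();
        String lname = doc.getElementsByTagName("lname").item(0).getTextContent();
        return new String[] { lname, fname };
    }

    public static void main(String[] args) throws Exception {
        String xml = "<people>\n<person>\n <fname>x</fname>\n  <lname>y</lname>\n</person>\n"
                   + "<person><fname>a</fname><lname>b</lname></person>\n</people>";
        InputStream in = new ByteArrayInputStream(xml.getBytes("UTF-8"));
        for (String rec : readRecords(in, "<person>", "</person>")) {
            String[] kv = toKeyValue(rec);
            System.out.println(kv[0] + "\t" + kv[1]);
        }
    }
}
```

Run on the sample in main, this prints one tab-separated line per person
record: y and x for the first, b and a for the second.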


I hope these are enough pointers for you to write the code.


Regards,
Som Shekhar Sharma
+91-8197243810


On Mon, Dec 9, 2013 at 6:30 PM, Ranjini Rathinam <ranjinibecse@gmail.com> wrote:
> Hi Subroto Sanyal,
>
> The link provided about XML does not work. The XmlContent class, as
> written, is not allowed in the XmlInputFormat.
>
> Please help: has anyone coded this scenario? I need working code.
>
> I have also written it using a SAX parser, but even though the jars are
> added to the classpath, the error that comes up is a NoClassFound exception.
>
> Please provide sample code for the same.
>
> Thanks in advance,
> Ranjini.R
>
> On Mon, Dec 9, 2013 at 12:34 PM, Ranjini Rathinam <ranjinibecse@gmail.com>
> wrote:
>>
>>
>>>> Hi,
>>>>
>>>> As suggested by the link below, I have used it for my program,
>>>>
>>>> but I am facing the issues below. Please help me fix these errors.
>>>>
>>>>
>>>> XmlReader.java:8: XmlReader.Map is not abstract and does not override
>>>> abstract method
>>>> map(org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>,org.apache.hadoop.mapred.Reporter)
>>>> in org.apache.hadoop.mapred.Mapper
>>>>  public static class Map extends MapReduceBase implements Mapper
>>>> <LongWritable, Text, Text, Text> {
>>>>                ^
>>>> ./XmlInputFormat.java:16: XmlInputFormat.XmlRecordReader is not abstract
>>>> and does not override abstract method
>>>> next(java.lang.Object,java.lang.Object) in
>>>> org.apache.hadoop.mapred.RecordReader
>>>> public class XmlRecordReader implements RecordReader {
>>>>        ^
>>>> Note: XmlReader.java uses unchecked or unsafe operations.
>>>> Note: Recompile with -Xlint:unchecked for details.
>>>> 2 errors
>>>>
>>>>
>>>> I am using Hadoop 0.20 and Java 1.6.
>>>>
>>>> Please suggest.
>>>>
>>>> Thanks in advance.
>>>>
>>>> Regards,
>>>> Ranjini. R
>>>> On Mon, Dec 9, 2013 at 11:08 AM, Ranjini Rathinam
>>>> <ranjinibecse@gmail.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>> ---------- Forwarded message ----------
>>>>> From: Subroto <ssanyal@datameer.com>
>>>>> Date: Fri, Dec 6, 2013 at 4:42 PM
>>>>> Subject: Re: Hadoop-MapReduce
>>>>> To: user@hadoop.apache.org
>>>>>
>>>>>
>>>>> Hi Ranjini,
>>>>>
>>>>> A good example to look into :
>>>>> http://www.undercloud.org/?p=408
>>>>>
>>>>> Cheers,
>>>>> Subroto Sanyal
>>>>>
>>>>> On Dec 6, 2013, at 12:02 PM, Ranjini Rathinam wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> How can I read an XML file via MapReduce and load it into HBase and
>>>>> Hive using Java?
>>>>>
>>>>> Please provide sample code.
>>>>>
>>>>> I am using Hadoop 0.20 and Java 1.6. Which parser version
>>>>> should be used?
>>>>>
>>>>> Thanks in advance.
>>>>>
>>>>> Ranjini
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
