hadoop-common-user mailing list archives

From Aleksandr Elbakyan <ramal...@yahoo.com>
Subject Re: Processing xml files
Date Tue, 24 May 2011 23:25:26 GMT
Hello,

We have the same type of data. We currently convert it to a tab-delimited file and use
that file as the input for streaming.
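
A minimal sketch of such a conversion step, assuming each XML file holds repeated <record> elements with simple child fields. The tag name "record" and the field names below are hypothetical placeholders, not something stated in this thread; only the Python standard library is used.

```python
import sys
import xml.etree.ElementTree as ET

# Hypothetical field names; replace with the child elements of your own records.
FIELDS = ["id", "name", "value"]


def clean(text):
    """Collapse whitespace so tabs or newlines in the data cannot break the TSV layout."""
    return " ".join((text or "").split())


def xml_to_tsv_lines(xml_data, record_tag="record", fields=FIELDS):
    """Return one tab-delimited line per <record_tag> element in xml_data (str or bytes)."""
    root = ET.fromstring(xml_data)
    lines = []
    for rec in root.iter(record_tag):
        # findtext returns None for a missing child; clean() turns that into "".
        lines.append("\t".join(clean(rec.findtext(f)) for f in fields))
    return lines


if __name__ == "__main__":
    # Usage (assumed): python xml_to_tsv.py a.xml b.xml > records.tsv
    for path in sys.argv[1:]:
        with open(path, "rb") as fh:
            for line in xml_to_tsv_lines(fh.read()):
                print(line)
```

Writing the records from many small XML files into one larger tab-delimited file would also likely help with the small-file concern, since the streaming job then reads a few large inputs instead of many 120k ones.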

Regards,
Aleksandr

--- On Tue, 5/24/11, Mohit Anchlia <mohitanchlia@gmail.com> wrote:

From: Mohit Anchlia <mohitanchlia@gmail.com>
Subject: Processing xml files
To: common-user@hadoop.apache.org
Date: Tuesday, May 24, 2011, 4:16 PM

I just started learning Hadoop and have finished the wordcount MapReduce
example. I also briefly looked at Hadoop streaming.

Some questions:
1) What should be my first step now? Are there more examples
somewhere that I can try out?
2) My second question is about practical usability with XML files. Our
XML files are not big; they are around 120k in size, but Hadoop is
really meant for big files, so how do I go about processing these XML
files?
3) Are there any samples or advice on how to process XML files?


Looking for help and pointers.
