hadoop-common-user mailing list archives

From Kayla Jay <kaylai...@yahoo.com>
Subject Re: Map/Reduce with XML files ..
Date Tue, 29 Apr 2008 00:14:10 GMT
Yes, I'm talking about a collection of small XML files stored in "container" files, i.e.,
lots and lots of small XML files collected into big files, not one gargantuan XML file.
How would you go about using Hadoop to split, process, and handle these sorts of
XML files?


----- Original Message ----
From: Ted Dunning <tdunning@veoh.com>
To: core-user@hadoop.apache.org
Sent: Monday, April 28, 2008 4:16:20 PM
Subject: Re: Map/Reduce with XML files ..


The only real problem with xml and map-reduce is if you are talking about
one gargantuan XML file.  That makes correct splitting difficult.
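
To illustrate why record boundaries matter for splitting, here is a minimal sketch (plain Java, not Hadoop code) of scanning a chunk of text for complete records between a begin and an end marker, roughly the idea behind marker-based readers such as StreamXmlRecordReader. The class name XmlRecordScanner, the method extractRecords, and the <doc> tag are hypothetical, and the sketch assumes records are not nested and the markers never occur inside text or attribute values.

```java
import java.util.ArrayList;
import java.util.List;

public class XmlRecordScanner {
    // Returns every substring that starts with beginTag and ends with endTag.
    // Assumption: records are not nested and the markers never appear
    // inside element text or attribute values.
    public static List<String> extractRecords(String chunk, String beginTag, String endTag) {
        List<String> records = new ArrayList<>();
        int pos = 0;
        while (true) {
            int start = chunk.indexOf(beginTag, pos);
            if (start < 0) {
                break; // no further record starts in this chunk
            }
            int end = chunk.indexOf(endTag, start + beginTag.length());
            if (end < 0) {
                // Record is cut off by the chunk boundary. This sketch drops it;
                // a real reader would have to read past the boundary instead.
                break;
            }
            end += endTag.length();
            records.add(chunk.substring(start, end));
            pos = end;
        }
        return records;
    }

    public static void main(String[] args) {
        // A chunk that begins and ends in the middle of a record.
        String chunk = "tail of previous</doc><doc><id>1</id></doc>\n"
                     + "<doc><id>2</id></doc><doc><id>3";
        for (String record : extractRecords(chunk, "<doc>", "</doc>")) {
            System.out.println(record);
        }
    }
}
```

With many small documents, a chunk boundary cuts through at most one short record at each end. With a single gargantuan document, there is no such marker pair to resynchronize on.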

If you are talking about millions or billions of small xml files (stored in
some sort of container file), then hadoop should be pretty easy to use.
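
As a hedged illustration of the per-record work a map could do once each small XML document arrives as a string, the sketch below counts XPath matches using only the JDK's javax.xml.xpath API. It is not Hadoop code. The class name XPathRecordCounter, the method countMatches, and the sample element names are hypothetical, and the sketch assumes each record is a well-formed XML document.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathRecordCounter {
    // Counts the nodes in one XML record that match the given XPath expression.
    // Assumption: xmlRecord is a complete, well-formed XML document.
    public static int countMatches(String xmlRecord, String expression) {
        try {
            // A fresh XPath object per call keeps the sketch simple;
            // the factory and XPath objects are not thread-safe.
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    expression,
                    new InputSource(new StringReader(xmlRecord)),
                    XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            // Raised for an invalid expression or for XML that cannot be parsed.
            throw new IllegalArgumentException("Could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        String record = "<doc><tag>a</tag><tag>b</tag><other/></doc>";
        System.out.println(countMatches(record, "//tag"));
    }
}
```

In a map, a count like this could be emitted under a key of your choosing and summed in the reduce. Parsing each record into a DOM is reasonable for small documents, but would not suit a single very large file.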


On 4/28/08 9:39 AM, "Kayla Jay" <kaylais30@yahoo.com> wrote:

> Hello
> 
> Has anyone had any experience with processing xml files within Hadoop within
> their maps/reduces?
> In particular, has anyone used any sort of XQuery/XPath processing within
> their maps/reduces?
> Say I have an XML string passed to the map and now I want to find something in
> particular via XQuery/XPath of some sort, to run numbers on occurrences or
> parse out a particular section within the XML.
> 
> Anyone done any XML processing looking for things within XML?  Then, aggregate
> common pieces together in the reduces ?
> 
> 
> On another note,
> Has anyone figured out splits for XML files?
> Has anyone written a custom XML reader other than the StreamXmlRecordReader?
> The only one I've read about and can find anything is:
> http://www.nabble.com/map-reduce-function-on-xml-string-td15816818.html
> 
> 
> Thanks.

