hadoop-common-user mailing list archives

From Alex Loddengaard <a...@cloudera.com>
Subject Re: parsing open xml
Date Sat, 13 Jun 2009 00:11:16 GMT
When you refer to "filesystem," do you mean HDFS?

It's very common to store lots of text files in HDFS and run multiple jobs
to process / learn about those text files.  As for XML support, you can use
Java libraries (or Python libraries if you're using Hadoop streaming) to
parse the XML; Hadoop itself doesn't have much XML support.  I hope this
answers your question.
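
To make the streaming option concrete, here is a minimal sketch of a Python mapper that parses XML with the standard library's xml.etree.ElementTree and emits one tab-separated "tag, 1" record per element. It assumes each input line holds one complete, well-formed XML document; that layout is my assumption for illustration, not something Hadoop arranges for you, and the function names map_line and run are hypothetical.

```python
import sys
import xml.etree.ElementTree as ET


def map_line(line):
    """Parse one XML document held on a single line; return (tag, 1) pairs."""
    line = line.strip()
    if not line:
        return []
    try:
        root = ET.fromstring(line)
    except ET.ParseError:
        # Assumption: skip malformed records instead of failing the whole task.
        return []
    # iter() walks the root and all descendants in document order.
    return [(element.tag, 1) for element in root.iter()]


def run(lines):
    """Yield tab-separated key/value records, the format streaming expects."""
    for line in lines:
        for tag, count in map_line(line):
            yield "%s\t%d" % (tag, count)


# As a streaming mapper, the script would finish with something like:
#     for record in run(sys.stdin):
#         print(record)
```

A script like this would be passed to the Hadoop streaming jar as the -mapper, and a reducer could then sum the counts per tag. Documents that span many lines would need a different way of splitting the input.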

Alex

On Fri, Jun 12, 2009 at 1:31 PM, Alexandre Jaquet <alexjaquet@gmail.com> wrote:

> Hi,
>
> Will Hadoop and MapReduce allow me to parse a large quantity of Open XML
> files stored in the same filesystem, using multiple jobs?
>
> Thx
>
> Alexandre Jaquet
>
