crunch-dev mailing list archives

From Christian Tzolov <christian.tzo...@gmail.com>
Subject Re: Crunch XML Source
Date Mon, 26 Jan 2015 01:04:06 GMT
Here is the related JIRA ticket:
https://issues.apache.org/jira/browse/CRUNCH-491

Btw, I've got your book (up)

On Sun, Jan 25, 2015 at 5:14 PM, Josh Wills <jwills@cloudera.com> wrote:

> None that I know of-- I had to do the same thing to parse some XML data in
> a couple of chapters of the Spark book we were writing. Would obviously
> love to have that in crunch-core.
>
> J
>
> On Sun, Jan 25, 2015 at 5:20 AM, Christian Tzolov <
> christian.tzolov@gmail.com> wrote:
>
> > Hi there,
> >
> > Recently I had to ingest some XML-formatted data. I couldn't find a related
> > topic in the mailing lists, so I've implemented a Crunch XmlSource (
> > https://github.com/tzolov/crunch-xmlsource) reusing Mahout's
> > XmlInputFormat/XmlRecordReader implementations.
> >
> > Are there any alternative approaches?
> >
> > Apologies if this topic has been discussed already!
> >
> > Cheers,
> > Chris
> >
>
>
>
> --
> Director of Data Science
> Cloudera <http://www.cloudera.com>
> Twitter: @josh_wills <http://twitter.com/josh_wills>
>
