crunch-user mailing list archives

From: Micah Whitacre
Subject: Re: Output Sequence Files into ORC
Date: Mon, 14 Sep 2015 20:50:52 GMT

You might look at the OrcSourceTarget integration tests[1].  I'm not an
expert on ORC files, but it looks like the tests have a few examples of
reading and writing them.

[1] -

On Mon, Sep 14, 2015 at 8:29 AM, Ben Watson wrote:

> Hi all,
> I'm trying to write a simple converter in Crunch to turn Sequence files
> into ORC files. The only examples that I can find for dealing with ORC
> files are the tutorial at
> and
> then the discussion at
> The tutorial seems to only show how to output data that's already in ORC
> format, which isn't much use for me here.
> It would be nice to be able to output ORC files like you can with Java
> MapReduce -
> - specifying a Struct, parsing each record into some type of object, and
> letting the output do the rest. I've tried to replicate this in Crunch by
> writing a MapFn that basically turns each record into an OrcWritable, but
> it doesn't work, and even if it did I suspect it wouldn't be very efficient.
> Is this something that's already possible that I'm missing?
> Thanks,
> Ben
