crunch-user mailing list archives

From Elliot West
Subject Binning operation for the generation of Hive partitioned data
Date Tue, 22 Apr 2014 11:11:22 GMT

I'm evaluating Apache Crunch as a possible replacement for some of our data
processing frameworks that run on Hadoop. I can find Crunch constructs that
map to most of the operations our processes perform. However,
we frequently bin data by a date field in order to generate
partitioned Hive tables - a fairly common operation, I believe. I can't find
a comparable binning operation in the Crunch user manual and was wondering
if, and how, this would be achieved with Apache Crunch?
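
To make the binning operation concrete, here is a minimal plain-Java sketch
of the grouping involved. It is not Crunch code and uses no Crunch or Hive
APIs. The Event type, the DateBinning class, the method names and the "dt"
partition column name are all hypothetical, chosen only for illustration.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustration only: groups records by a date field into Hive-style
// partition directory names. All names here are hypothetical.
class DateBinning {

    // Hypothetical record type: a date field plus an opaque payload.
    static final class Event {
        final LocalDate date;
        final String payload;

        Event(LocalDate date, String payload) {
            this.date = date;
            this.payload = payload;
        }
    }

    // Builds a Hive-style partition directory name such as "dt=2014-04-22".
    // The partition column name "dt" is an assumption.
    static String partitionFor(LocalDate date) {
        // LocalDate.toString() yields the ISO-8601 form yyyy-MM-dd.
        return "dt=" + date;
    }

    // Bins events by their date field, keyed by partition directory name.
    // A TreeMap keeps the partitions in sorted key order.
    static Map<String, List<Event>> binByDate(List<Event> events) {
        Map<String, List<Event>> bins = new TreeMap<>();
        for (Event e : events) {
            bins.computeIfAbsent(partitionFor(e.date), k -> new ArrayList<>())
                .add(e);
        }
        return bins;
    }
}
```

In our real processes each bin would be written out under its own partition
directory rather than held in memory; the sketch only shows the grouping
step I am hoping to express in Crunch.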

Cheers - Elliot.
