crunch-user mailing list archives

From Jeremy Lewi <jer...@lewi.us>
Subject Crunch and Contrail
Date Sun, 08 Dec 2013 19:30:51 GMT
Hi Crunch Users and Developers,

Just wanted to let you know that we're starting to explore the use of
Crunch for Contrail
(http://sourceforge.net/apps/mediawiki/contrail-bio/index.php?title=Contrail).
Contrail is a bioinformatics application written on top of Hadoop. Since
the algorithm includes several sequences of MR jobs, Crunch would be very
useful, both for its more convenient programming model and for the
potential for improved execution performance through pipeline optimization.

I hit what appeared to be some compatibility issues with Crunch 0.8.0 and
Hadoop 1.2.1, but building Crunch from HEAD seemed to fix them.

Our first example of Crunch was a simple word-count-like pipeline for
collecting graph statistics. This was a breeze to write, especially compared
to writing the equivalent MR jobs.

Contrail uses Avro extensively, so Crunch's Avro support is critical for us.

Thanks,
Jeremy
