hadoop-general mailing list archives

From Eelco Hillenius <eelco.hillen...@gmail.com>
Subject comparing sub/ 3rd party projects that abstract map/reduce
Date Fri, 14 Aug 2009 19:30:09 GMT

Would people mind sharing their opinions about the relative strengths
and weaknesses of Hive, Chukwa and Pig (and possibly other sub/3rd
party projects that abstract map/reduce)?

I'm only just getting into Hadoop, but it seems to me these libs have
a considerable overlap depending on what you use them for? I plan to
check those sub projects out anyway, and I'm definitely not trying
to start a flame war here, but it would be great to hear some opinions
about what areas these projects are particularly useful for and what
might need some work etc.

As a bit of context, we (Teachscape) are considering Hadoop for
storing audit log files and extracting information from them. The
audit logging we do is application specific, e.g. user Foo deleted
Survey X, user Bar moved organization node AAB to AB, etc, and besides
the need to run a couple of fixed reports weekly (mainly to give our
customers some insight into how they are using our application), I
expect us to want to create queries on the fly to e.g. track down
problems. I don't expect our developers to have a problem writing
Map/Reduce programs, but I do like the idea of a higher level way of
extracting information.

Any thoughts would be greatly appreciated,

