incubator-bigtop-dev mailing list archives

From "Roman Shaposhnik (JIRA)"
Subject [jira] [Created] (BIGTOP-669) Add DataFu to Bigtop distribution
Date Thu, 05 Jul 2012 17:07:34 GMT
Roman Shaposhnik created BIGTOP-669:

             Summary: Add DataFu to Bigtop distribution
                 Key: BIGTOP-669
             Project: Bigtop
          Issue Type: Bug
          Components: General
    Affects Versions: 0.4.0
            Reporter: Roman Shaposhnik
            Assignee: Roman Shaposhnik
             Fix For: 0.4.0, 0.5.0

DataFu is a collection of user-defined functions for working with large-scale data in Hadoop
and Pig. This library was born out of the need for a stable, well-tested library of UDFs for
data mining and statistics. It is used at LinkedIn in many of our off-line workflows for data
derived products like "People You May Know" and "Skills".

DataFu is available under the Apache License v2 from their GitHub project page:

The latest release of DataFu is: 0.0.4

Note: this will also open up the possibility for Bigtop to start collecting custom UDF implementations
for other projects such as Hive. For now, I simply propose an extra package called pig-udf-datafu.

This message is automatically generated by JIRA.