hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-908) Hadoop Abacus, a package for performing simple counting/aggregation
Date Fri, 19 Jan 2007 20:00:30 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-908:

       Resolution: Fixed
    Fix Version/s: 0.11.0
           Status: Resolved  (was: Patch Available)

I just committed this.  Thanks, Runping!

> Hadoop Abacus, a package for performing simple counting/aggregation
> -------------------------------------------------------------------
>                 Key: HADOOP-908
>                 URL: https://issues.apache.org/jira/browse/HADOOP-908
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: contrib/streaming
>            Reporter: Runping Qi
>         Assigned To: Runping Qi
>             Fix For: 0.11.0
>         Attachments: abacus.patch
> The Hadoop Abacus package is a specialization of the map/reduce framework
> for performing various counting and aggregation tasks.
> It offers functionality similar to Google's Sawzall.
> Generally speaking, to implement an application using the Map/Reduce model,
> the developer needs to implement Map and Reduce functions (and possibly a Combine function).
> However, for many applications related to counting and computing statistics,
> these functions have very similar characteristics.
> Abacus abstracts out the general patterns and provides a package implementing those patterns.
> In particular, the package provides a generic mapper class, a reducer class, a combiner class,
> and a set of built-in value aggregators. It also provides a generic utility class, ValueAggregatorJob,
> for creating Abacus jobs.
> To create an Abacus job, the user just needs to implement one plugin class that
> is responsible for specifying which aggregators to use and which values go to which aggregators.
> The mapper calls this class at runtime to generate aggregation ids and values.
> The generic combiner and reducer aggregate the values associated with the same
> aggregation ids accordingly. Thus, it is much easier to create and run an Abacus job
> than a normal map/reduce job. Since a built-in generic combiner is always used, the execution
> is very efficient.
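
The pattern described above (one user-written plugin class, with generic map and combine/reduce steps around it) can be illustrated with a minimal, self-contained sketch. This is not the code from abacus.patch and does not use the Hadoop API: the names AbacusSketch, Descriptor, Entry and WordStats, and the "sum:" and "max:" id prefixes, are all hypothetical and chosen only for illustration. The sketch assumes that the aggregation id itself tells the generic reducer which aggregator to apply.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only: every type and name here is hypothetical,
// not the real Abacus or Hadoop classes.
final class AbacusSketch {

    /** One (aggregation id, value) pair generated for an input record. */
    static final class Entry {
        final String aggregationId;
        final long value;

        Entry(String aggregationId, long value) {
            this.aggregationId = aggregationId;
            this.value = value;
        }
    }

    /** The single plugin the user writes: record in, aggregation ids and values out. */
    interface Descriptor {
        List<Entry> generateEntries(String record);
    }

    /** Example plugin: count every word and track the longest line. */
    static final class WordStats implements Descriptor {
        @Override
        public List<Entry> generateEntries(String record) {
            List<Entry> out = new ArrayList<>();
            for (String word : record.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    out.add(new Entry("sum:" + word, 1L));
                }
            }
            out.add(new Entry("max:lineLength", record.length()));
            return out;
        }
    }

    /** Generic merge shared by combiner and reducer; the id prefix picks the aggregator. */
    static long merge(String aggregationId, long current, long next) {
        if (aggregationId.startsWith("sum:")) {
            return current + next;
        }
        if (aggregationId.startsWith("max:")) {
            return Math.max(current, next);
        }
        throw new IllegalArgumentException("unknown aggregator: " + aggregationId);
    }

    /** Generic driver: the "map" step calls the plugin, then values are merged per id. */
    static Map<String, Long> run(Descriptor descriptor, List<String> records) {
        Map<String, Long> totals = new TreeMap<>();
        for (String record : records) {
            for (Entry e : descriptor.generateEntries(record)) {
                totals.merge(e.aggregationId, e.value,
                        (current, next) -> merge(e.aggregationId, current, next));
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        Map<String, Long> totals = run(new WordStats(), List.of("a b a", "b c"));
        totals.forEach((id, value) -> System.out.println(id + "\t" + value));
    }
}
```

In this sketch only WordStats is application-specific. Because the merge step is associative for both sum and max, the same function could run as a combiner on each mapper's output before the reduce step, which is the property the description relies on for efficiency.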

This message is automatically generated by JIRA.
If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

