hama-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hama Wiki] Trivial Update of "Partitioning" by edwardyoon
Date Tue, 08 Jan 2013 09:06:10 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hama Wiki" for change notification.

The "Partitioning" page has been changed by edwardyoon:
http://wiki.apache.org/hama/Partitioning?action=diff&rev1=4&rev2=5

  == User-defined partitioning ==
  
- The partitioner is designed for determining how to distribute the input data among computing
workers of a Bulk Synchronous Parallel processing. Remember, this is not related with output
collection, unlike Map/Reduce's partition function.
+ In the Hama BSP computing framework, the Partition function is used to obtain scalability
in Bulk Synchronous Parallel processing and to determine how to distribute the slices of
input data among BSP processors. Unlike the MapReduce data processing model, many scientific algorithms
based on the message-passing Bulk Synchronous Parallel model require that a processor obtain
“nearby or related” data from other processors in order to complete the processing. In
this case, processors determine their communication partners, or neighbors, using the Partition
function.
  
- Input data-partitioning works as following sequence:
+ Internally, input data partitioning works in the following sequence:
  
   * If the user specified a partition function, a "partitioning job" is run internally as a pre-processing
step.
    * Each task of the "partitioning job" reads its assigned data block and rewrites it into the appropriate
partition files.
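
The steps above can be illustrated with a minimal sketch. It is not taken from the Hama source: the `PartitionFunction` interface, the `HashPartitionFunction` class, and the `partitionBlock` helper are hypothetical names chosen for illustration, hash partitioning is only one possible policy, and in-memory lists stand in for the partition files.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PartitionSketch {

    // Hypothetical interface for illustration; not copied from the Hama API.
    interface PartitionFunction<K> {
        int getPartition(K key, int numPartitions);
    }

    // Hash partitioning: equal keys always map to the same partition.
    // floorMod keeps the result non-negative even for negative hash codes.
    static final class HashPartitionFunction<K> implements PartitionFunction<K> {
        @Override
        public int getPartition(K key, int numPartitions) {
            return Math.floorMod(key.hashCode(), numPartitions);
        }
    }

    // Simulates one task of the "partitioning job": read the assigned block
    // and rewrite each record into the list standing in for a partition file.
    static <K> List<List<K>> partitionBlock(List<K> block,
                                            PartitionFunction<K> fn,
                                            int numPartitions) {
        List<List<K>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayList<>());
        }
        for (K record : block) {
            partitions.get(fn.getPartition(record, numPartitions)).add(record);
        }
        return partitions;
    }

    public static void main(String[] args) {
        List<Integer> block = Arrays.asList(0, 1, 2, 3, 4, 5, 6, 7);
        List<List<Integer>> parts =
            partitionBlock(block, new HashPartitionFunction<Integer>(), 3);
        for (int p = 0; p < parts.size(); p++) {
            System.out.println("partition " + p + ": " + parts.get(p));
        }
    }
}
```

Because the same function maps a given key to the same partition on every task, a BSP processor could in principle use it to work out which peer holds a record it needs, which is the "communication partner" role described above.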
