hama-dev mailing list archives

From "Edward J. Yoon" <edwardy...@apache.org>
Subject Re: [jira] [Commented] (HAMA-883) [Research Task] Massive log event aggregation in real time using Apache Hama
Date Fri, 11 Apr 2014 05:40:38 GMT
Yesterday, I surveyed Storm. Storm's task grouping and chainable
bolts seem pretty nice (chainable bolts in particular can be really
useful for real-time join operations).

I think we can also implement functions similar to Storm's task
grouping and chainable bolts on BSP. My rough idea is:

1. Launch multiple tasks per node (one per group of Bolts). For example:

+---------------+
|    Server1    |
+---------------+
Task-1. tailing bolt
Task-2. split sentence bolt
Task-3. wordcount bolt

2. Assign each task to its proper group.
--
3. Each task executes its user-defined function and sends messages
to a task of the next group.
4. Synchronize all tasks.
--
5. Finally, repeat steps 3 and 4.
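
To make steps 3 to 5 concrete, here is a minimal single-threaded sketch in
plain Java. It does not use the Hama API; the class name, the inbox queues,
and the three hard-coded groups are assumptions for illustration only. Each
loop iteration stands in for one superstep, and swapping the inboxes stands
in for the barrier synchronization.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.TreeMap;

// Hypothetical simulation of the idea above; not Hama code.
public class BoltPipelineSketch {
    // One inbox per task group: 0 = tailing, 1 = split sentence, 2 = wordcount.
    static final int GROUPS = 3;

    static List<Queue<String>> emptyInboxes() {
        List<Queue<String>> inboxes = new ArrayList<>();
        for (int g = 0; g < GROUPS; g++) {
            inboxes.add(new ArrayDeque<>());
        }
        return inboxes;
    }

    /** Runs supersteps until no messages remain and returns the word counts. */
    static Map<String, Integer> run(List<String> logLines) {
        List<Queue<String>> inbox = emptyInboxes();
        inbox.get(0).addAll(logLines); // lines "tailed" by the first group
        Map<String, Integer> counts = new TreeMap<>();

        boolean pending = true;
        while (pending) { // step 5: repeat steps 3 and 4
            List<Queue<String>> next = emptyInboxes();

            // Step 3: each group runs its function and sends to the next group.
            for (String line : inbox.get(0)) {
                next.get(1).add(line); // tailing bolt forwards raw lines
            }
            for (String sentence : inbox.get(1)) {
                for (String word : sentence.trim().split("\\s+")) {
                    if (!word.isEmpty()) {
                        next.get(2).add(word); // split sentence bolt emits words
                    }
                }
            }
            for (String word : inbox.get(2)) {
                counts.merge(word, 1, Integer::sum); // wordcount bolt aggregates
            }

            // Step 4: barrier; sent messages become visible in the next superstep.
            inbox = next;
            pending = inbox.stream().anyMatch(q -> !q.isEmpty());
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("a b a", "b c")));
    }
}
```

Note that a message needs one superstep per group to travel through the
chain, so the pipeline depth adds directly to latency.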

The only difficult part here is how to determine the task group in the
initial superstep. So, I'd like to add the method below to the BSPPeer
interface.

  /**
   * @return the names of locally adjacent peers (including this peer).
   */
  public String[] getAdjacentPeerNames();
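
One possible way a task could use that list to pick its group in the initial
superstep is sketched below in plain Java. Everything here is an assumption:
the helper name groupOf, the "host:port" peer names, and the rule that a
task's position in the sorted local peer list is its group number. It also
assumes every task on a node sees the same list and that the node runs
exactly one task per bolt.

```java
import java.util.Arrays;

// Hypothetical helper; not part of the Hama API.
public class TaskGroupSketch {
    // Bolt for each group number, following the Server1 example.
    static final String[] BOLTS = {"tailing", "split sentence", "wordcount"};

    /**
     * Returns the position of this peer in the sorted list of local peers.
     * Sorting gives every task on the node the same ordering, so each task
     * can derive its group without exchanging any messages.
     */
    static int groupOf(String self, String[] adjacentPeerNames) {
        String[] sorted = adjacentPeerNames.clone();
        Arrays.sort(sorted);
        int index = Arrays.binarySearch(sorted, self);
        if (index < 0) {
            throw new IllegalArgumentException(self + " is not a local peer");
        }
        return index;
    }

    public static void main(String[] args) {
        // Stand-in for the value getAdjacentPeerNames() would return.
        String[] local = {"server1:61003", "server1:61001", "server1:61002"};
        for (String peer : local) {
            System.out.println(peer + " -> " + BOLTS[groupOf(peer, local)] + " bolt");
        }
    }
}
```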


On Thu, Apr 3, 2014 at 11:00 AM, Yexi Jiang <yexijiang@gmail.com> wrote:
> great~
>
>
> 2014-04-02 21:43 GMT-04:00 Edward J. Yoon (JIRA) <jira@apache.org>:
>
>>
>>     [
>> https://issues.apache.org/jira/browse/HAMA-883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13958430#comment-13958430]
>>
>> Edward J. Yoon commented on HAMA-883:
>> -------------------------------------
>>
>> NOTE: my fellow worker is currently working on this issue -
>> https://github.com/garudakang/meerkat
>>
>> > [Research Task] Massive log event aggregation in real time using Apache
>> Hama
>> >
>> ----------------------------------------------------------------------------
>> >
>> >                 Key: HAMA-883
>> >                 URL: https://issues.apache.org/jira/browse/HAMA-883
>> >             Project: Hama
>> >          Issue Type: Task
>> >            Reporter: Edward J. Yoon
>> >
>> > BSP tasks can be used for aggregating log data streamed in real time.
>> With this research task, we might be able to turn this kind of
>> processing into a platform.
>>
>>
>>
>> --
>> This message was sent by Atlassian JIRA
>> (v6.2#6252)
>>
>
>
>
> --
> ------
> Yexi Jiang,
> ECS 251,  yjian004@cs.fiu.edu
> School of Computer and Information Science,
> Florida International University
> Homepage: http://users.cis.fiu.edu/~yjian004/



-- 
Edward J. Yoon (@eddieyoon)
Chief Executive Officer
DataSayer Co., Ltd.
