hama-dev mailing list archives

From "Edward J. Yoon" <edward.y...@samsung.com>
Subject Streaming, multi-BSP job pipelines, and streaming graph&incremental ML
Date Tue, 16 Dec 2014 23:59:55 GMT

As you know, the trend has already begun shifting from batch toward real-time and streaming
processing. To support streaming graph processing and incremental learning in Hama, I recently
began a full-scale investigation of streaming data processing[1] and multi-BSP job pipelines[2].

Basically, the problem is how to process an unstructured input stream and transfer its output
stream to the next "advanced streaming analytics" job without overhead. There is also a tricky
issue in determining where "new data" and "updates" should be delivered. Some systems use
shared memory or support only micro-batch algorithms, but we can solve this problem efficiently
and directly by message-passing between multiple jobs.
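
To make the idea concrete, here is a rough sketch of two pipelined jobs that talk only through
messages. It is plain Python, not Hama code and not the design proposed in the issues below. The
names (parse_stage, analytics_stage, run_pipeline), the "add"/"del" record format, and the
end-of-stream sentinel are all assumptions made up for illustration. The first job turns raw
lines into typed messages; the second applies each message to its state as it arrives, so "new
data" (add) and "updates" (del) both go to the job that owns that state.

```python
import queue
import threading

SENTINEL = None  # hypothetical end-of-stream marker, not part of any real API


def parse_stage(raw_lines, outbox):
    """Job 1: turn unstructured lines into typed messages and send them on."""
    for line in raw_lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip malformed records
        kind, src, dst = parts
        if kind in ("add", "del"):
            outbox.put((kind, src, dst))
    outbox.put(SENTINEL)


def analytics_stage(inbox, degrees):
    """Job 2: update out-degree counts incrementally, one message at a time."""
    while True:
        msg = inbox.get()
        if msg is SENTINEL:
            break
        kind, src, _dst = msg
        delta = 1 if kind == "add" else -1
        degrees[src] = degrees.get(src, 0) + delta


def run_pipeline(raw_lines):
    """Wire the two jobs together with a message channel and run them."""
    channel = queue.Queue()
    degrees = {}
    consumer = threading.Thread(target=analytics_stage, args=(channel, degrees))
    consumer.start()
    parse_stage(raw_lines, channel)
    consumer.join()
    return degrees
```

Calling run_pipeline(["add a b", "add a c", "del a b"]) leaves vertex "a" with an out-degree
of 1. The two jobs share no state in this sketch; the queue is the only coupling between them,
and it stands in for whatever inter-job messaging the real pipeline would use.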

1. https://issues.apache.org/jira/browse/HAMA-883
2. https://issues.apache.org/jira/browse/HAMA-901