hbase-dev mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (HBASE-3220) Coprocessors: Streaming distributed computation framework
Date Sat, 11 Apr 2015 01:16:13 GMT

     [ https://issues.apache.org/jira/browse/HBASE-3220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell resolved HBASE-3220.
    Resolution: Later

> Coprocessors: Streaming distributed computation framework
> ---------------------------------------------------------
>                 Key: HBASE-3220
>                 URL: https://issues.apache.org/jira/browse/HBASE-3220
>             Project: HBase
>          Issue Type: Brainstorming
>          Components: Coprocessors
>            Reporter: Andrew Purtell
> Consider a computational framework based on a stream processing model. Logically: generators
> emit keys (row keys, or full keys with row+column:qualifier); fetch operators join keys to
> data fetched from the region; filters drop items according to (perhaps complex) matching on the
> keys and/or values; combiners perform aggregation; mutators change values; decorators add
> data; and sinks do something useful with items arriving from the stream, e.g. insert into a
> response buffer, commit to the region, or replicate to a peer. Pipelines execute in parallel.
> Partitioners can split streams for multithreading. A generator can be an observer on a region,
> anchoring a continuous process, or an iterator serving as the first stage of a pipeline
> constructed on demand with a terminating condition (like a Hadoop task). This is kind of like
> Cascading within regionserver processes: a nice model, even if Cascading is not literally the
> implementation. MapReduce can be supported with this model because it is a subset of it. Data
> can be ordered or unordered, depending on the generator. Filters could be stateful or
> stateless: stateless filters could handle data arriving in any order; stateful filters could
> be used with an ordered generator.
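
As a rough illustration of the model described above, here is a minimal sketch of a pipeline built from a generator, a fetch operator, filters, a combiner, and a sink. It is written in Python for brevity and is not HBase code: the in-memory REGION dict, the (row key, value) item shape, the use of the first character of the key as a grouping prefix, and every function name are assumptions made up for this example. An actual coprocessor would be Java running inside the regionserver.

```python
from itertools import groupby
from typing import Dict, Iterable, Iterator, List, Tuple

Item = Tuple[str, int]  # hypothetical item shape: (row key, value)

# Hypothetical in-memory stand-in for a region's data: row key -> value.
REGION: Dict[str, int] = {"a1": 3, "a2": 5, "b1": 7, "b2": 0, "c1": 4}


def generate_keys(region: Dict[str, int]) -> Iterator[str]:
    """Generator stage: emit row keys in sorted order (an ordered generator)."""
    yield from sorted(region)


def fetch(keys: Iterable[str], region: Dict[str, int]) -> Iterator[Item]:
    """Fetch operator: join each key to the data stored for it."""
    for key in keys:
        yield key, region[key]


def drop_zeros(items: Iterable[Item]) -> Iterator[Item]:
    """Stateless filter: judges each item on its own, so input order is irrelevant."""
    return (item for item in items if item[1] != 0)


def first_per_prefix(items: Iterable[Item]) -> Iterator[Item]:
    """Stateful filter: keep only the first item seen for each key prefix.

    It remembers just the previous prefix, so it relies on ordered input.
    """
    previous = None
    for key, value in items:
        prefix = key[0]
        if prefix != previous:
            yield key, value
        previous = prefix


def sum_by_prefix(items: Iterable[Item]) -> Iterator[Item]:
    """Combiner: aggregate the values of adjacent items sharing a key prefix."""
    for prefix, group in groupby(items, key=lambda item: item[0][0]):
        yield prefix, sum(value for _, value in group)


def run_pipeline(region: Dict[str, int], *stages) -> List[Item]:
    """Chain the stages lazily; the sink is a list standing in for a response buffer."""
    stream = fetch(generate_keys(region), region)
    for stage in stages:
        stream = stage(stream)
    return list(stream)


if __name__ == "__main__":
    print(run_pipeline(REGION, drop_zeros, sum_by_prefix))
    print(run_pipeline(REGION, first_per_prefix))
```

Both first_per_prefix and sum_by_prefix assume that items with the same prefix arrive next to each other, which is why they are paired with the ordered generator here, while drop_zeros could sit behind any generator. Partitioners, mutators, decorators, and parallel execution are left out to keep the sketch short.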

This message was sent by Atlassian JIRA
