spark-issues mailing list archives

From "Sean Owen (JIRA)" <>
Subject [jira] [Resolved] (SPARK-2201) Improve FlumeInputDStream's stability and make it scalable
Date Thu, 11 Dec 2014 20:55:13 GMT


Sean Owen resolved SPARK-2201.
    Resolution: Won't Fix

I hope I understood this right, but the PR discussion seemed to end with the suggestion that this
would not go into Spark, but perhaps into a contrib repo, and that it was already partly implemented
by other changes.

> Improve FlumeInputDStream's stability and make it scalable
> ----------------------------------------------------------
>                 Key: SPARK-2201
>                 URL:
>             Project: Spark
>          Issue Type: Improvement
>          Components: Streaming
>            Reporter: sunsc
> Currently:
> FlumeUtils.createStream(ssc, "localhost", port); 
> This means that only one Flume receiver can work with FlumeInputDStream, so the solution
> is not scalable.
> I use ZooKeeper to solve this problem.
> Spark Flume receivers register themselves under a ZooKeeper path when they start, and a Flume agent
> gets the physical hosts and pushes events to them.
> Some work needs to be done here:
> 1. Each receiver creates a tmp node in ZooKeeper; listeners just watch those tmp nodes.
> 2. When Spark FlumeReceivers start, each acquires a physical host (localhost's IP and
> an idle port) and registers itself with ZooKeeper.
> 3. A new Flume sink. In its appendEvents method, it gets the physical hosts and pushes
> data to them in a round-robin manner.
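
For readers of the archive, the register-then-round-robin idea in steps 1 to 3 of the quoted proposal can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the PR: the class name RoundRobinReceivers, its methods, and the sample addresses are all hypothetical, and an in-memory list stands in for the ZooKeeper path that the receivers would register under.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical sketch of the proposal: receivers register a "host:port"
 * address, and a sink picks the next address in round-robin order.
 * The in-memory list is an assumption standing in for the ZooKeeper path.
 */
public class RoundRobinReceivers {
    // One entry per live receiver (stand-in for the tmp nodes in ZooKeeper).
    private final List<String> receivers = new CopyOnWriteArrayList<>();
    // Monotonic counter used to rotate through the registered receivers.
    private final AtomicInteger next = new AtomicInteger();

    /** Called by a receiver when it starts (step 2 of the proposal). */
    public void register(String hostPort) {
        receivers.add(hostPort);
    }

    /** Called when a receiver goes away, mimicking a tmp node disappearing. */
    public void unregister(String hostPort) {
        receivers.remove(hostPort);
    }

    /** Called by the sink for each batch of events (step 3 of the proposal). */
    public String pick() {
        // Take a snapshot so a concurrent unregister cannot shrink the list
        // between the size check and the lookup.
        List<String> snapshot = new ArrayList<>(receivers);
        if (snapshot.isEmpty()) {
            throw new IllegalStateException("no receivers registered");
        }
        // floorMod keeps the index non-negative even if the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), snapshot.size());
        return snapshot.get(index);
    }

    public static void main(String[] args) {
        RoundRobinReceivers registry = new RoundRobinReceivers();
        // Sample addresses are made up for the example.
        registry.register("10.0.0.1:41414");
        registry.register("10.0.0.2:41414");
        for (int batch = 0; batch < 4; batch++) {
            System.out.println("batch " + batch + " -> " + registry.pick());
        }
    }
}
```

A real sink would additionally have to watch the registry for changes and decide what to do with a batch whose target receiver has just disappeared; the sketch leaves both out.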

This message was sent by Atlassian JIRA
