camel-issues mailing list archives

From "Henryk Konsek (JIRA)" <>
Subject [jira] [Created] (CAMEL-9385) Create Apache Spark component
Date Wed, 02 Dec 2015 10:32:10 GMT
Henryk Konsek created CAMEL-9385:

             Summary: Create Apache Spark component
                 Key: CAMEL-9385
             Project: Camel
          Issue Type: New Feature
            Reporter: Henryk Konsek
            Assignee: Henryk Konsek
             Fix For: 2.17.0

As part of the IoT project I'm working on, I have created a Spark component (1) to make
it easier to handle analytics requests from devices. I would like to donate this code to
Apache Camel and extend it here, as I guess many people would be interested in using
Spark from Camel.

The URI looks like {{spark:rdd/rddName/rddCallback}} or {{spark:dataframe/frameName/frameCallback}},
depending on whether you would like to work with RDDs or DataFrames.

The idea here is that a Camel route acts as the driver application. You specify RDD/DataFrame
definitions (and callbacks to act against those) in a registry (for example as Spring beans
or OSGi services). Then you send the parameters for the computation as the body of a message.

For example, in Spring Boot you register the RDD and callback as beans:

@Bean
JavaRDD<String> myRdd(JavaSparkContext sparkContext) {
  return sparkContext.textFile("foo.txt");
}

@Component
class MyAnalytics {

  long countLines(JavaRDD<String> textFile, long argument) {
    return textFile.count() * argument;
  }

}
Then you ask for the results of computations:

long results = producerTemplate.requestBody("spark:rdd/myRdd/MyAnalytics", 10, long.class);

Such a setup is extremely useful for bridging Spark computations over different transports.
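As a minimal sketch of such bridging (the netty4-http consumer URI here is an illustrative assumption, not part of the proposed component), a route exposing the computation over HTTP could look like:

```java
import org.apache.camel.builder.RouteBuilder;

// Illustrative sketch only: bridges an HTTP transport to the Spark driver.
// The netty4-http consumer endpoint and port are assumptions for this example.
public class SparkBridgeRoute extends RouteBuilder {
    @Override
    public void configure() throws Exception {
        from("netty4-http:http://0.0.0.0:18080/analytics")
            // the message body carries the computation parameters,
            // the reply carries the result of the registered callback
            .to("spark:rdd/myRdd/MyAnalytics");
    }
}
```

Devices would then POST their parameters to the HTTP endpoint and receive the Spark result as the response, without knowing anything about the driver.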


This message was sent by Atlassian JIRA
