incubator-blur-dev mailing list archives

From "Dibyendu Bhattacharya (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (BLUR-387) Blur Spark Connector
Date Thu, 23 Oct 2014 05:05:34 GMT

     [ https://issues.apache.org/jira/browse/BLUR-387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dibyendu Bhattacharya updated BLUR-387:
---------------------------------------
    Attachment: spark-blur-connector.rar

This Spark Blur connector indexes Kafka messages into Apache Blur using the following Kafka consumer:

https://github.com/dibbhatt/kafka-spark-consumer

This fault-tolerant Kafka consumer uses the low-level Kafka API to pull messages from Kafka
topic partitions through a custom Spark receiver.

For more details, please refer to: https://github.com/dibbhatt/kafka-spark-consumer

The Spark Blur connector uses this Kafka consumer to index Kafka messages through the Spark Hadoop APIs.


The Kafka DStream is repartitioned so that the number of partitions equals the number of shards
in the target Blur table.

The connector uses a custom Spark Partitioner to map each key to the correct RDD partition, which
in turn maps to the corresponding Blur shard.
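
The key-to-shard mapping described above can be sketched roughly as follows. This is a minimal illustration, not the code in the attachment: the function names (java_string_hash, shard_for_key, group_by_shard) are hypothetical, and the hash shown (a Java-style String.hashCode masked to a non-negative value, then taken modulo the shard count) is an assumption. A real partitioner has to use exactly the same row-id-to-shard function as Blur itself, otherwise records would land in the wrong shard.

```python
from typing import Any, List, Tuple


def java_string_hash(s: str) -> int:
    """Java-style String.hashCode (assumed hash; exact only for BMP characters)."""
    h = 0
    for ch in s:
        # Keep the running value inside 32 bits, as Java int arithmetic would.
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    # Reinterpret the 32-bit value as a signed int.
    return h - 0x100000000 if h >= 0x80000000 else h


def shard_for_key(row_id: str, num_shards: int) -> int:
    """Map a row id to a partition index in [0, num_shards)."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    # Mask off the sign bit so the modulo result is never negative.
    return (java_string_hash(row_id) & 0x7FFFFFFF) % num_shards


def group_by_shard(records: List[Tuple[str, Any]], num_shards: int) -> List[List[Tuple[str, Any]]]:
    """Bucket (row_id, value) pairs into one list per shard.

    This mimics, in plain Python, what repartitioning with a custom
    partitioner does: partition i holds only the keys that belong to shard i.
    """
    partitions: List[List[Tuple[str, Any]]] = [[] for _ in range(num_shards)]
    for row_id, value in records:
        partitions[shard_for_key(row_id, num_shards)].append((row_id, value))
    return partitions
```

With PySpark, a function like shard_for_key could be supplied as the partition function of RDD.partitionBy, with the number of partitions set to the table's shard count, so that each output partition is written to exactly one shard.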

> Blur Spark Connector
> --------------------
>
>                 Key: BLUR-387
>                 URL: https://issues.apache.org/jira/browse/BLUR-387
>             Project: Apache Blur
>          Issue Type: New Feature
>            Reporter: Dibyendu Bhattacharya
>         Attachments: spark-blur-connector.rar
>
>
> Integrate Apache Blur with Spark Streaming / Spark.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
