manifoldcf-dev mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject RE: [jira] [Commented] (CONNECTORS-1162) Apache Kafka Output Connector
Date Sun, 31 May 2015 00:56:48 GMT
Hi Tugba,

It really depends on the technology you would be using and its
characteristics. You could also model it on the HDFS output connector,
for instance.


Sent from my Windows Phone
From: Tugba Dogan (JIRA)
Sent: 5/30/2015 4:46 PM
To: dev@manifoldcf.apache.org
Subject: [jira] [Commented] (CONNECTORS-1162) Apache Kafka Output Connector

    [ https://issues.apache.org/jira/browse/CONNECTORS-1162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14566190#comment-14566190 ]

Tugba Dogan commented on CONNECTORS-1162:
-----------------------------------------

Hi Karl,

I have a question about the implementation. Which existing connector
should I use as a reference while writing the Kafka output connector?
I think the Null output connector would be a good starting point. What
do you think?

> Apache Kafka Output Connector
> -----------------------------
>
>                 Key: CONNECTORS-1162
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1162
>             Project: ManifoldCF
>          Issue Type: Wish
>    Affects Versions: ManifoldCF 1.8.1, ManifoldCF 2.0.1
>            Reporter: Rafa Haro
>            Assignee: Karl Wright
>              Labels: gsoc, gsoc2015
>             Fix For: ManifoldCF 1.10, ManifoldCF 2.2
>
>
> Kafka is a distributed, partitioned, replicated commit log service. It provides the functionality
> of a messaging system, but with a unique design. A single Kafka broker can handle hundreds
> of megabytes of reads and writes per second from thousands of clients.
> Apache Kafka is being used for a number of use cases. One of them is to use Kafka as
> a feeding system for streaming Big Data processes, in either Apache Spark or Hadoop environments.
> A Kafka output connector could be used for streaming or dispatching crawled documents or metadata
> and putting them into a Big Data processing pipeline.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
