manifoldcf-dev mailing list archives

From "Karl Wright (JIRA)" <>
Subject [jira] [Commented] (CONNECTORS-1162) Apache Kafka Output Connector
Date Thu, 02 Jul 2015 07:04:04 GMT


Karl Wright commented on CONNECTORS-1162:

Hi [~tugbadogan],

I didn't hear back from you about whether you were ready for code review.  I presume that,
other than the unit tests, you were ready.

Based on what has been done so far, I've given you a "pass" for the midterm.  Here are my
determinations, and recommendations going forward:
(1) You seem to have developed a good working understanding of Kafka and ManifoldCF.
(2) You've produced workable code that can be integrated into MCF, with some minor editing.
Specific recommendations:
- I would suggest taking maximum advantage of me and the MCF community at large for subsequent
development.  More frequent and detailed communication would help a lot. Don't be afraid to
ask questions, post code snippets, and describe what you've tried that isn't working.  Modern
software development requires both individual initiative and collaboration.


> Apache Kafka Output Connector
> -----------------------------
>                 Key: CONNECTORS-1162
>                 URL:
>             Project: ManifoldCF
>          Issue Type: Wish
>    Affects Versions: ManifoldCF 1.8.1, ManifoldCF 2.0.1
>            Reporter: Rafa Haro
>            Assignee: Karl Wright
>              Labels: gsoc, gsoc2015
>             Fix For: ManifoldCF 1.10, ManifoldCF 2.2
>         Attachments: 1.JPG, 2.JPG
> Kafka is a distributed, partitioned, replicated commit log service. It provides the functionality
of a messaging system, but with a unique design. A single Kafka broker can handle hundreds
of megabytes of reads and writes per second from thousands of clients.
> Apache Kafka is being used for a number of use cases. One of them is to use Kafka as
a feeding system for streaming BigData processes, in both Apache Spark and Hadoop environments.
A Kafka output connector could be used for streaming or dispatching crawled documents or metadata
and putting them in a BigData processing pipeline.

This message was sent by Atlassian JIRA
