gearpump-dev mailing list archives

From Manu Zhang <owenzhang1...@gmail.com>
Subject Re: Sink/source tickets
Date Fri, 06 May 2016 02:56:08 GMT
Hi Kam,

What I mean is something like this:

HDFS ~ kafka-hdfs-connector ~> Kafka ~ KafkaSource ~> Gearpump ~ KafkaSink
~> Kafka ~ kafka-jdbc-connector ~> MySQL

Well, I think this will be easier to implement than wrapping Storm
connectors.
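
To make the chain above concrete, here is a minimal Python sketch. It is not Gearpump or Kafka Connect code: a "topic" is modelled as a plain list, every function name is hypothetical, and the replay-from-offset logic is only an assumption about how at-least-once delivery behaves in such a pipeline.

```python
from typing import Callable, Iterable, List

# Hypothetical stand-ins: a "topic" is an append-only list and each stage is
# a plain function. None of these names come from Gearpump or Kafka.
Topic = List[str]


def source_connector(external_records: Iterable[str], topic: Topic) -> None:
    """Plays the role of kafka-hdfs-connector: external system -> Kafka."""
    topic.extend(external_records)


def process(in_topic: Topic, out_topic: Topic,
            fn: Callable[[str], str], committed: int = 0) -> int:
    """Plays the role of KafkaSource ~> Gearpump ~> KafkaSink.

    Reads from the committed offset onward and returns the new offset.
    Restarting from an older offset re-emits records: duplicates are
    possible, but nothing is lost (at-least-once).
    """
    for record in in_topic[committed:]:
        out_topic.append(fn(record))
    return len(in_topic)


def sink_connector(topic: Topic, external_store: List[str]) -> None:
    """Plays the role of kafka-jdbc-connector: Kafka -> external system."""
    external_store.extend(topic)


def run_pipeline(lines: Iterable[str]) -> List[str]:
    """Wire the three stages together, end to end."""
    in_topic: Topic = []
    out_topic: Topic = []
    table: List[str] = []
    source_connector(lines, in_topic)
    process(in_topic, out_topic, str.upper)
    sink_connector(out_topic, table)
    return table
```

The point of the sketch is that the processing stage only ever talks to topics, so swapping HDFS or MySQL for another system changes a connector and nothing else.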

On Fri, May 6, 2016 at 10:09 AM Jiang Weihua <whjiang@outlook.com> wrote:

> From the usage I know of, many cleaning applications read from Kafka and
> write to Kafka. But other kinds of apps don’t follow this pattern.
>
>
>
>
> On 16/5/6, 9:37 AM, "Kam Kasravi" <kamkasravi@gmail.com> wrote:
>
> >Other benefits? Is there a performance cost? Would we co-locate both our
> >source and KafkaSource in the same JVM?
> >
> >On Thursday, May 5, 2016, Jiang Weihua <whjiang@outlook.com> wrote:
> >
> >> I would say it is a good shortcut for current usage. However, we definitely
> >> need our own sources and sinks in the long term.
> >>
> >> Sent from my iPhone
> >>
> >> On May 6, 2016, at 06:49, Manu Zhang <owenzhang1990@gmail.com> wrote:
> >>
> >> Hi Kam and others,
> >>
> >> Do you think it makes sense to utilize kafka-connect
> >> <http://docs.confluent.io/2.0.0/connect/connectors.html> for source/sink?
> >> The topology would be like source ~> KafkaSource ~> DAG ~> KafkaSink ~>
> >> sink.
> >> One benefit is that we always get the at-least-once delivery provided by the
> >> current KafkaSource.
> >> Kafka provides HDFS and JDBC connectors out of the box, and other connectors
> >> are being contributed by the community
> >> <https://github.com/search?p=1&q=kafka-connect&type=Repositories&utf8=%E2%9C%93>.
> >>
> >> On Thu, May 5, 2016 at 11:35 PM Kam Kasravi <kamkasravi@gmail.com> wrote:
> >>
> >> Hi Karol
> >>
> >> Good feedback. I'm not sure whether GEARPUMP-116 would allow easy
> >> integration of Redis, JMS, and AMQP from the Beam and akka-stream
> >> perspectives. Huafeng, Manu?
> >>
> >>
> >> On Wed, May 4, 2016 at 10:34 AM, Karol Brejna <karol.brejna@gmail.com> wrote:
> >>
> >> We have a series of jira tickets regarding Gearpump sinks/sources:
> >>
> >> https://issues.apache.org/jira/browse/GEARPUMP-116 - Compatibility layer/adapter for Apache Storm
> >> https://issues.apache.org/jira/browse/GEARPUMP-115 - Create MQTT source/sink
> >> https://issues.apache.org/jira/browse/GEARPUMP-106 - Gearpump Redis Integration
> >> https://issues.apache.org/jira/browse/GEARPUMP-105 - Provide non-persistent Sink Task so that examples like word count can materialize Sum results within the Client
> >> https://issues.apache.org/jira/browse/GEARPUMP-100 - Source task that emits messages per a schedule (interval or otherwise) should be provided
> >> https://issues.apache.org/jira/browse/GEARPUMP-95 - Add parquet datasource and datasink connectors
> >> https://issues.apache.org/jira/browse/GEARPUMP-91 - Apache Cassandra Integration
> >>
> >> We also had a ticket for 'Add a HDFS Sink with security'
> >> (https://github.com/gearpump/gearpump/issues/1547). I am not sure what the
> >> outcome of that one was.
> >>
> >> Most of them concern the medium (MQTT, Redis, Cassandra, ...). Others talk
> >> about the source mechanics (scheduled/repetitive source).
> >>
> >> I'd like to discuss the order in which we plan to implement them.
> >>
> >> In my opinion, Redis and MQTT (GEARPUMP-106, GEARPUMP-115) seem the most
> >> important to have.
> >> Redis is well known and widely used. MQTT is a de facto standard in IoT
> >> communications.
> >>
> >> Then I would like to have the HDFS sink (if we haven't merged it already).
> >>
> >> A non-persistent datasink could be very useful for example/demo purposes.
> >> (Imagine we have a capped collection that the application can send messages
> >> to, a kind of application console. In the dashboard there could be a section
> >> that presents the latest 'console' messages. This way a user could "watch" the
> >> application's progress, especially if he/she doesn't have access to the
> >> backend, as often happens in YARN mode. But this is a topic for a
> >> dedicated discussion, I think.)
> >>
> >> On the other hand, if we start working on GEARPUMP-116, we'd probably
> >> quickly have Redis, JMS, and AMQP sources (adapted from Storm).
> >>
> >> Please let me know what you think.
> >>
> >> Karol
> >>
> >>
> >>
>
>
