flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-6660) expand the streaming connectors overview page
Date Tue, 23 May 2017 07:14:04 GMT

    [ https://issues.apache.org/jira/browse/FLINK-6660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16020737#comment-16020737

ASF GitHub Bot commented on FLINK-6660:

Github user tzulitai commented on a diff in the pull request:

    --- Diff: docs/dev/connectors/index.md ---
    @@ -25,22 +25,54 @@ specific language governing permissions and limitations
     under the License.
    -Connectors provide code for interfacing with various third-party systems.
    +* toc
    -Currently these systems are supported: (Please select the respective documentation page from the navigation on the left.)
    +## Predefined Sources and Sinks
    - * [Apache Kafka](https://kafka.apache.org/) (sink/source)
    - * [Elasticsearch](https://elastic.co/) (sink)
    - * [Hadoop FileSystem](http://hadoop.apache.org) (sink)
    - * [RabbitMQ](http://www.rabbitmq.com/) (sink/source)
    - * [Amazon Kinesis Streams](http://aws.amazon.com/kinesis/streams/) (sink/source)
    - * [Twitter Streaming API](https://dev.twitter.com/docs/streaming-apis) (source)
    - * [Apache NiFi](https://nifi.apache.org) (sink/source)
    - * [Apache Cassandra](https://cassandra.apache.org/) (sink)
    +A few basic data sources and sinks are built into Flink and are always available.
    +The [predefined data sources]({{ site.baseurl }}/dev/datastream_api.html#data-sources) include reading from files, directories, and sockets, and
    +ingesting data from collections and iterators.
    +The [predefined data sinks]({{ site.baseurl }}/dev/datastream_api.html#data-sinks) support writing to files, to stdout and stderr, and to sockets.
    +## Bundled Connectors
    +Connectors provide code for interfacing with various third-party systems. Currently these systems are supported:
    -To run an application using one of these connectors, additional third party
    -components are usually required to be installed and launched, e.g. the servers
    -for the message queues. Further instructions for these can be found in the
    -corresponding subsections.
    + * [Apache Kafka](kafka.html) (sink/source)
    + * [Apache Cassandra](cassandra.html) (sink)
    + * [Amazon Kinesis Streams](kinesis.html) (sink/source)
    + * [Elasticsearch](elasticsearch.html) (sink)
    + * [Hadoop FileSystem](filesystem_sink.html) (sink)
    + * [RabbitMQ](rabbitmq.html) (sink/source)
    + * [Apache NiFi](nifi.html) (sink/source)
    + * [Twitter Streaming API](twitter.html) (source)
    +Keep in mind that to use one of these connectors in an application, additional third party
    +components are usually required, e.g. servers for the data stores or message queues.
    +Note also that while the streaming connectors listed in this section are part of the
    +Flink project and are included in source releases, they are not included in the binary distribution.
    +Further instructions can be found in the corresponding subsections.
    +## Related Topics
    +### Data Enrichment via Async I/O
    +Streaming applications sometimes need to pull in data from external services and databases
    +in order to enrich their event streams.
    +Flink offers an API for [Asynchronous I/O]({{ site.baseurl }}/dev/stream/asyncio.html)
    +to make it easier to do this efficiently and robustly.
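The asynchronous enrichment pattern described above can be sketched with plain Java futures. This is an illustrative stand-in, not Flink's `AsyncFunction` API: all class and method names here are hypothetical, and the point is only that lookups are issued concurrently rather than one blocking call at a time.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

// Hypothetical sketch of async enrichment. In Flink, the lookup would live
// inside an AsyncFunction and the framework would track in-flight requests.
class AsyncEnrichment {

    // Simulates a non-blocking lookup against an external service.
    static CompletableFuture<String> lookup(String key) {
        return CompletableFuture.supplyAsync(() -> key + ":enriched");
    }

    // Fires all lookups concurrently, then collects results in input order.
    static List<String> enrichAll(List<String> events) {
        List<CompletableFuture<String>> futures = events.stream()
                .map(AsyncEnrichment::lookup)
                .collect(Collectors.toList());
        return futures.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
    }
}
```

Because every lookup is started before any result is awaited, total latency approaches that of the slowest single call rather than the sum of all calls, which is the efficiency gain the Async I/O API provides.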
    +### Side Outputs
    +You can always connect an input stream to as many sinks as you like, but sometimes it is
    +useful to emit additional result streams "on the side," as it were.
    +[Side Outputs]({{ site.baseurl }}/dev/stream/side_output.html) allow you to flexibly
    +split and filter your datastream in a typesafe way.
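The side-output idea above can be sketched in a few lines of plain Java. This is not Flink's `OutputTag`/`ProcessFunction` API; the class and field names are illustrative, showing only the shape of the pattern: one processing step emits a main result stream and routes the remainder to a separately tagged output instead of dropping it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of side outputs: a single processing step feeding
// two typed outputs, so "rejected" records stay available downstream.
class SideOutputSketch {
    final List<Integer> main = new ArrayList<>();
    final List<Integer> side = new ArrayList<>();

    // Non-negative elements go to the main output; negative elements are
    // diverted to the side output rather than being filtered away.
    void process(List<Integer> input) {
        for (int x : input) {
            if (x >= 0) {
                main.add(x);
            } else {
                side.add(x);
            }
        }
    }
}
```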
    +### Queryable State
    +Rather than always pushing data to external data stores, it is also possible for external applications to query Flink,
    +and read from the partitioned state it manages on demand.
    +In some cases this [Queryable State]({{ site.baseurl }}/dev/stream/queryable_state.html) interface can
    +eliminate what would otherwise be a bottleneck.
    --- End diff --
    Not sure if it's just me, but it was a bit unclear to me at first glance what "would otherwise be a bottleneck" refers to. Perhaps move the "bottleneck" aspect directly after "always pushing data to external data stores" to make this clearer?

> expand the streaming connectors overview page 
> ----------------------------------------------
>                 Key: FLINK-6660
>                 URL: https://issues.apache.org/jira/browse/FLINK-6660
>             Project: Flink
>          Issue Type: Improvement
>          Components: Documentation, Streaming Connectors
>    Affects Versions: 1.3.0, 1.4.0
>            Reporter: David Anderson
>            Assignee: David Anderson
> The overview page for streaming connectors is too lean -- it should provide more context and also guide the reader toward related topics.
> Note that FLINK-6038 will add links to the Bahir connectors.

This message was sent by Atlassian JIRA
