pulsar-commits mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [pulsar] Anonymitaet commented on a change in pull request #4338: [Doc] Add Connect Pulsar to MySQL
Date Thu, 23 May 2019 06:27:31 GMT
Anonymitaet commented on a change in pull request #4338: [Doc] Add Connect Pulsar to MySQL
URL: https://github.com/apache/pulsar/pull/4338#discussion_r286792220
 
 

 ##########
 File path: site2/docs/io-quickstart.md
 ##########
 @@ -378,11 +338,415 @@ cqlsh:pulsar_test_keyspace> select * from pulsar_test_table;
   key-8 |  key-8
 ```
 
-### Delete the Cassandra Sink
+## Delete the Cassandra Sink
 
 ```shell
 bin/pulsar-admin sink delete \
     --tenant public \
     --namespace default \
     --name cassandra-test-sink
 ```
+
+# Connect Pulsar to MySQL
+
+> ### Tip
+> Make sure Docker is available on your computer. If you don't have Docker installed, follow the instructions [here](https://docs.docker.com/docker-for-mac/install/).
+
+## Set up a MySQL cluster
+
+Use the MySQL 5.7 Docker image to start a single-node MySQL cluster in Docker.
+
+1. Pull the MySQL 5.7 image from Docker Hub.
+
+    ```
+    $ docker pull mysql:5.7
+    ```
+
+2. Start MySQL.
+
+    ```
+    $ docker run -d -it --rm \
+    --name pulsar-mysql \
+    -p 3306:3306 \
+    -e MYSQL_ROOT_PASSWORD=jdbc \
+    -e MYSQL_USER=mysqluser \
+    -e MYSQL_PASSWORD=mysqlpw \
+    mysql:5.7
+    ```
+
+    > ### Tip
+    >
+    > Flag | Description | This example
+    > - | - | -
+    > `-d` | Start the container in detached mode. | /
+    > `-it` | Keep STDIN open even if not attached and allocate a terminal. | /
+    > `--rm` | Remove the container automatically when it exits. | /
+    > `--name` | Assign a name to the container. | This example specifies _pulsar-mysql_ for the container.
+    > `-p` | Publish the port of the container to the host. | This example publishes the port _3306_ of the container to the host.
+    > `-e` | Set environment variables. | This example sets the following variables:<br>- The password for the root user is _jdbc_. <br>- The name for the normal user is _mysqluser_. <br>- The password for the normal user is _mysqlpw_.
+    >
+    > For more information about the `docker run` command, see [here](https://docs.docker.com/engine/reference/commandline/run/).
+
+3. Check if MySQL has been started successfully.
+
+    ```
+    $ docker logs -f pulsar-mysql
+    ```
+
+    MySQL has been started successfully if the following message appears.
+
+    ```
+    2019-05-11T10:40:58.709964Z 0 [Note] Found ca.pem, server-cert.pem and server-key.pem in data directory. Trying to enable SSL support using them.
+    2019-05-11T10:40:58.710155Z 0 [Warning] CA certificate ca.pem is self signed.
+    2019-05-11T10:40:58.711921Z 0 [Note] Server hostname (bind-address): '*'; port: 3306
+    2019-05-11T10:40:58.711985Z 0 [Note] IPv6 is available.
+    2019-05-11T10:40:58.712695Z 0 [Note]   - '::' resolves to '::';
+    2019-05-11T10:40:58.712742Z 0 [Note] Server socket created on IP: '::'.
+    2019-05-11T10:40:58.714334Z 0 [Warning] Insecure configuration for --pid-file: Location '/var/run/mysqld' in the path is accessible to all OS users. Consider choosing a different directory.
+    2019-05-11T10:40:58.723802Z 0 [Note] Event Scheduler: Loaded 0 events
+    2019-05-11T10:40:58.724200Z 0 [Note] mysqld: ready for connections.
+    Version: '5.7.26'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  MySQL Community Server (GPL)
+    ```
+
+4. Access MySQL.
+
+    ```
+    $ docker exec -it pulsar-mysql /bin/bash
+    mysql -h localhost -uroot -pjdbc
+    ```
+
+5. Create a _pulsar_mysql_jdbc_sink_ database and table.
+
+    ```
+    mysql> create database pulsar_mysql_jdbc_sink;
+
+    mysql> use pulsar_mysql_jdbc_sink;
+
+    mysql> create table if not exists pulsar_mysql_jdbc_sink
+    (
+    id INT AUTO_INCREMENT,
+    name VARCHAR(255) NOT NULL,
+    primary key (id)
+    )
+    engine=innodb;
+    ```
+
+## Configure a JDBC sink
+
+Now that MySQL is running locally, this section configures a JDBC sink connector. The JDBC sink connector reads messages from a Pulsar topic and writes them into a MySQL table.
+
+1. Add a configuration file.
+
+    To run a JDBC sink connector, prepare a YAML config file that contains the information the Pulsar IO runtime needs, such as how Pulsar IO can find the MySQL cluster, the JDBC URL, and the table that Pulsar IO writes messages to.
+
+    Create a _pulsar-mysql-jdbc-sink.yaml_ file, copy the following contents to this file, and place the file in the `pulsar/connectors` folder.
+
+    ```
+    configs:
+      userName: "root"
+      password: "jdbc"
+      jdbcUrl: "jdbc:mysql://127.0.0.1:3306/pulsar_mysql_jdbc_sink"
+      tableName: "pulsar_mysql_jdbc_sink"
+    ```
+
+2. Create a schema.
+
+    Create an _avro-schema_ file, copy the following contents to this file, and place the file in the `pulsar/connectors` folder.
+
+    ```
+    {
+      "type": "AVRO",
+      "schema": "{\"type\":\"record\",\"name\":\"Test\",\"fields\":[{\"name\":\"id\",\"type\":[\"null\",\"int\"]},{\"name\":\"name\",\"type\":[\"null\",\"string\"]}]}",
+      "properties": {}
+    }
+    ```
+
+    > #### Tip
+    >
+    > For more information about AVRO, see [Apache Avro Documentation](https://avro.apache.org/docs/1.8.2/).
+
+
+3. Upload a schema to a topic.  
+
+    This example uploads the _avro-schema_ schema to the _pulsar-mysql-jdbc-sink-topic_ topic.
+
+    ```
+    $ bin/pulsar-admin schemas upload pulsar-mysql-jdbc-sink-topic -f ./connectors/avro-schema
+    ```
+
+4. Check if the schema has been uploaded successfully.
+
+    ```
+    $ bin/pulsar-admin schemas get pulsar-mysql-jdbc-sink-topic
+    ```
+
+    The schema has been uploaded successfully if the following message appears.
+
+    ```
+    {
+      "name" : "pulsar-mysql-jdbc-sink-topic",
+      "schema" : "eyJ0eXBlIjoicmVjb3JkIiwibmFtZSI6IlRlc3QiLCJmaWVsZHMiOlt7Im5hbWUiOiJpZCIsInR5cGUiOlsibnVsbCIsImludCJdfSx7Im5hbWUiOiJuYW1lIiwidHlwZSI6WyJudWxsIiwic3RyaW5nIl19XX0=",
+      "type" : "AVRO",
+      "properties" : { }
+    }
+    ```
+
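The relationship between the _avro-schema_ file in step 2 and the `schemas get` output in step 4 can be checked with a short Python sketch. It assumes, based only on decoding the value shown above, that the `schema` field in the output is the Base64 encoding of the Avro definition. The variable names are illustrative and not part of Pulsar.

```python
import base64
import json

# Avro record definition, as in the avro-schema file from step 2.
avro_record = {
    "type": "record",
    "name": "Test",
    "fields": [
        {"name": "id", "type": ["null", "int"]},
        {"name": "name", "type": ["null", "string"]},
    ],
}

# The schema file stores the Avro definition as a JSON string nested inside
# JSON, which is why the quotes are escaped in the file.
schema_file = {
    "type": "AVRO",
    "schema": json.dumps(avro_record, separators=(",", ":")),
    "properties": {},
}
schema_file_text = json.dumps(schema_file, indent=2)

# "schema" value copied from the `schemas get` output in step 4.
returned_schema = (
    "eyJ0eXBlIjoicmVjb3JkIiwibmFtZSI6IlRlc3QiLCJmaWVsZHMiOlt7Im5hbWUiOiJpZCIs"
    "InR5cGUiOlsibnVsbCIsImludCJdfSx7Im5hbWUiOiJuYW1lIiwidHlwZSI6WyJudWxsIiwi"
    "c3RyaW5nIl19XX0="
)

# Assumption: the returned value is Base64-encoded JSON.
decoded = json.loads(base64.b64decode(returned_schema))
print(decoded == avro_record)
```

Comparing the decoded value with the uploaded definition is a quick way to confirm that the topic carries the schema you intended.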
+## Submit a JDBC sink
+
+Pulsar provides the [CLI](https://pulsar.apache.org/docs/en/pulsar-admin/) for running and managing Pulsar I/O connectors.
+
+This example creates a sink connector and specifies the desired information.
+
+```
+$ bin/pulsar-admin sink create \
+--archive ./connectors/pulsar-io-jdbc-{{pulsar:version}}.nar \
+--inputs pulsar-mysql-jdbc-sink-topic \
+--name pulsar-mysql-jdbc-sink \
+--sink-config-file ./connectors/pulsar-mysql-jdbc-sink.yaml \
+--parallelism 1
+```
+
+Once the command is executed, Pulsar creates a sink connector named _pulsar-mysql-jdbc-sink_. The sink connector runs as a Pulsar Function and writes the messages produced in the _pulsar-mysql-jdbc-sink-topic_ topic to the MySQL _pulsar_mysql_jdbc_sink_ table.
+
+> #### Tip
+>
+> Flag | Description | This example
+> - | - | -
+> `--archive` | Path to the archive file for the sink. | _pulsar-io-jdbc-{{pulsar:version}}.nar_
+> `--inputs` | The input topic or topics of the sink. <br> (Multiple topics can be specified as a comma-separated list.) | _pulsar-mysql-jdbc-sink-topic_
+> `--name` | The name of the sink. | _pulsar-mysql-jdbc-sink_
+> `--sink-config-file` | The path to a YAML config file specifying the configuration of the sink. | _pulsar-mysql-jdbc-sink.yaml_
+> `--parallelism` | The parallelism factor of the sink, that is, the number of sink instances to run. | _1_
+>
+> For more information about `pulsar-admin sink create options`, see [here](https://pulsar.apache.org/docs/en/pulsar-admin/#create-3).
+
+The sink has been created successfully if the following message appears.
+
+```
+"Created successfully"
+```
+
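If the `sink create` call needs to be scripted, the flags from the example above can be kept in one place and turned into a command line. The following is a minimal Python sketch. It uses only the flags shown in this section, and the helper name `build_sink_create_argv` is hypothetical. The `{{pulsar:version}}` placeholder is left unexpanded and must be replaced with your Pulsar version before running the command.

```python
import shlex

# Flag values copied from the `sink create` example in this section.
# {{pulsar:version}} is a docs placeholder, not a real file name.
SINK_CREATE_FLAGS = {
    "--archive": "./connectors/pulsar-io-jdbc-{{pulsar:version}}.nar",
    "--inputs": "pulsar-mysql-jdbc-sink-topic",
    "--name": "pulsar-mysql-jdbc-sink",
    "--sink-config-file": "./connectors/pulsar-mysql-jdbc-sink.yaml",
    "--parallelism": "1",
}


def build_sink_create_argv(flags, admin="bin/pulsar-admin"):
    """Hypothetical helper: build the argv list for `pulsar-admin sink create`."""
    argv = [admin, "sink", "create"]
    for flag, value in flags.items():
        argv += [flag, value]
    return argv


argv = build_sink_create_argv(SINK_CREATE_FLAGS)
# Print a shell-quoted command line for review; this sketch does not run it.
print(shlex.join(argv))
```

Passing the list to `subprocess.run(argv, check=True)` from the Pulsar installation directory would execute the command, assuming the connector archive and config file are in place.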
+## Inspect a JDBC sink
+
+### List all running JDBC sink(s)
+
+This example lists all running sink connectors.
+
+```
+$ bin/pulsar-admin sink list \
+--tenant public \
+--namespace default
+```
+
+The result shows that only the _pulsar-mysql-jdbc-sink_ sink is running.
+
+```
+[
+ "pulsar-mysql-jdbc-sink"
+]
+```
+
+### Get information of a JDBC sink
+
+This example gets the information about the _pulsar-mysql-jdbc-sink_ sink connector.
+
+```
+$ bin/pulsar-admin sink get \
+--tenant public \
+--namespace default \
+--name pulsar-mysql-jdbc-sink
+```
+
+The result shows the information about the sink connector, including the tenant, namespace, topic, and so on.
+
+```
+{
+  "tenant": "public",
+  "namespace": "default",
+  "name": "pulsar-mysql-jdbc-sink",
+  "className": "org.apache.pulsar.io.jdbc.JdbcAutoSchemaSink",
+  "inputSpecs": {
+    "pulsar-mysql-jdbc-sink-topic": {
+      "isRegexPattern": false
+    }
+  },
+  "configs": {
+    "password": "jdbc",
+    "jdbcUrl": "jdbc:mysql://127.0.0.1:3306/pulsar_mysql_jdbc_sink",
+    "userName": "root",
+    "tableName": "pulsar_mysql_jdbc_sink"
+  },
+  "parallelism": 1,
+  "processingGuarantees": "ATLEAST_ONCE",
+  "retainOrdering": false,
+  "autoAck": true
+}
+```
+
+### Get status of a JDBC sink
+
+This example checks the current status of the _pulsar-mysql-jdbc-sink_ sink connector.
+
+```
+$ bin/pulsar-admin sink status \
+--tenant public \
+--namespace default \
+--name pulsar-mysql-jdbc-sink
+```
+
+The result shows the current status of the sink connector, including the number of instances, running status, worker ID, and so on.
+
+```
+{
+  "numInstances" : 1,
+  "numRunning" : 1,
+  "instances" : [ {
+    "instanceId" : 0,
+    "status" : {
+      "running" : true,
+      "error" : "",
+      "numRestarts" : 0,
+      "numReadFromPulsar" : 0,
+      "numSystemExceptions" : 0,
+      "latestSystemExceptions" : [ ],
+      "numSinkExceptions" : 0,
+      "latestSinkExceptions" : [ ],
+      "numWrittenToSink" : 0,
+      "lastReceivedTime" : 0,
+      "workerId" : "c-standalone-fw-192.168.2.52-8080"
+    }
+  } ]
+}
+```
+
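The status output above can also be checked programmatically. The following Python sketch parses an abbreviated copy of that JSON and reports instances that are not running or that have recorded exceptions. The health criteria and the function name `unhealthy_instances` are assumptions made for illustration, not rules defined by Pulsar.

```python
import json

# Abbreviated copy of the `sink status` output shown above.
STATUS_JSON = """
{
  "numInstances" : 1,
  "numRunning" : 1,
  "instances" : [ {
    "instanceId" : 0,
    "status" : {
      "running" : true,
      "error" : "",
      "numRestarts" : 0,
      "numSystemExceptions" : 0,
      "numSinkExceptions" : 0,
      "numWrittenToSink" : 0
    }
  } ]
}
"""


def unhealthy_instances(status):
    """Return IDs of instances that are stopped or report errors (assumed criteria)."""
    bad = []
    for instance in status["instances"]:
        s = instance["status"]
        if (not s["running"] or s["error"]
                or s["numSystemExceptions"] > 0
                or s["numSinkExceptions"] > 0):
            bad.append(instance["instanceId"])
    return bad


status = json.loads(STATUS_JSON)
all_running = status["numRunning"] == status["numInstances"]
print(all_running, unhealthy_instances(status))
```

In practice, the JSON would come from the output of `bin/pulsar-admin sink status` rather than an embedded string.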
+## Stop a JDBC sink
+
+This example stops the _pulsar-mysql-jdbc-sink_ sink instance.
+
+```
+$ bin/pulsar-admin sink stop \
+--tenant public \
+--namespace default \
+--name pulsar-mysql-jdbc-sink \
+--instance-id 0
+```
+
+The sink instance has been stopped successfully if the following message appears.
+
+```
+"Stopped successfully"
+```
+
+## Restart a JDBC sink
+
+This example starts the _pulsar-mysql-jdbc-sink_ sink instance.
+
+```
+$ bin/pulsar-admin sink start \
+--tenant public \
+--namespace default \
+--name pulsar-mysql-jdbc-sink \
+--instance-id 0
+```
+
+The sink instance has been started successfully if the following message appears.
+
+```
+"Started successfully"
+```
+
+> ### Tip
+> Optionally, you can run a standalone sink connector using `pulsar-admin sink localrun options`.
+>
+> Note that `pulsar-admin sink localrun options` runs a sink connector locally, while `pulsar-admin sink start options` can run a sink connector locally or in a cluster.
 
 Review comment:
   Thanks, I've changed it to:
   
    `pulsar-admin sink start options` starts a connector in a cluster.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
