pulsar-commits mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] sijie closed pull request #1826: [WIP] Pulsar Functions worker configuration
Date Thu, 02 Aug 2018 21:38:09 GMT
sijie closed pull request #1826: [WIP] Pulsar Functions worker configuration
URL: https://github.com/apache/incubator-pulsar/pull/1826
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

diff --git a/site/_data/config/functions_worker.yaml b/site/_data/config/functions_worker.yaml
new file mode 100644
index 0000000000..9da02d05b1
--- /dev/null
+++ b/site/_data/config/functions_worker.yaml
@@ -0,0 +1,82 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+
+configs:
+- name: workerId
+  description: An identifier for the worker
+  default: standalone
+- name: workerHostname
+  description: The hostname used by the worker daemon
+  default: localhost
+- name: workerPort
+  description: The port used by the worker daemon
+  default: 6750
+- name: functionMetadataTopicName
+  description: The Pulsar topic used for worker daemon metadata transfer
+  default: metadata
+#- name: functionMetadataSnapshotsTopicPath
+#  description: TODO
+#  default: snapshots
+- name: clusterCoordinationTopicName
+  description: The Pulsar topic used for worker daemon cluster coordination
+  default: coordinate
+- name: pulsarFunctionsNamespace
+  description: The Pulsar namespace used for Pulsar-Functions-specific functionality
+  default: public/functions
+- name: pulsarFunctionsCluster
+  default: standalone
+- name: pulsarServiceUrl
+  description: The `pulsar` scheme URL for the Pulsar broker associated with the worker daemon
+  default: pulsar://localhost:6650
+- name: pulsarWebServiceUrl
+  description: The HTTP service URL for the Pulsar broker associated with the worker daemon
+  default: http://localhost:8080
+- name: numFunctionPackageReplicas
+  description: The number of replicas of the function package (i.e. the code resources for the function) to store
+  default: 1
+- name: downloadDirectory
+  description: The directory in which function packages are downloaded
+  default: /tmp/pulsar_functions
+- name: processContainerFactory
+  description: Add this parameter (with an optional [`logDirectory`](#logDirectory) sub-parameter) if you'd like to use the process-based runtime for Pulsar Functions (each function instance is run in its own process). This is the default runtime.
+- name: logDirectory
+  description: Optional sub-parameter for the process-based runtime
+- name: threadContainerFactory
+  description: Add this parameter if you'd like to use the thread-based runtime for Pulsar Functions (each function instance is run in its own JVM thread). The process-based runtime is the default.
+- name: schedulerClassName
+  description: The Java class name for the worker daemon scheduler implementation
+  default: org.apache.pulsar.functions.worker.scheduler.RoundRobinScheduler
+- name: functionAssignmentTopicName
+  description: The Pulsar topic used for function-assignment-related tasks
+  default: assignments
+- name: failureCheckFreqMs
+  description: The frequency with which the worker daemon checks for failures (in milliseconds)
+  default: 30000
+- name: rescheduleTimeoutMs
+  description: The timeout applied to Pulsar Function reschedule operations (in milliseconds)
+  default: 60000
+- name: initialBrokerReconnectMaxRetries
+  description: The maximum allowed number of retries when initializing broker reconnect to the worker daemon
+  default: 60
+- name: assignmentWriteMaxRetries
+  description: The maximum allowed number of retries when attempting to assign functions
+  default: 60
+- name: instanceLivenessCheckFreqMs
+  description: The frequency with which the worker daemon checks the liveness of Pulsar Function instances (in milliseconds)
+  default: 30000
\ No newline at end of file
diff --git a/site/docs/latest/functions/deployment.md b/site/docs/latest/functions/deployment.md
index c0871bfa79..5ac5c80878 100644
--- a/site/docs/latest/functions/deployment.md
+++ b/site/docs/latest/functions/deployment.md
@@ -208,3 +208,64 @@ Pulsar supports three different [subscription types](../../getting-started/Conce
 
 Pulsar Functions can also be assigned a subscription type when you [create](#cluster-mode) them or run them [locally](#local-run). In cluster mode, the subscription can also be [updated](#updating) after the function has been created.
 -->
+
+## The Pulsar Functions worker {#worker}
+
+Deployment of Pulsar Functions is handled by a dedicated worker process that runs alongside the Pulsar {% popover broker %}. The Pulsar Functions worker is responsible for starting, running, and stopping [instances](#parallelism) of Pulsar Functions.
+
+### Execution runtimes
+
+The Pulsar Functions worker supports two execution runtimes:
+
+* The [process-based](#process) runtime runs Pulsar Function [instances](#parallelism) as separate processes.
+* The [thread-based](#thread) runtime runs Pulsar Function instances as separate [JVM threads](https://docs.oracle.com/javase/tutorial/essential/concurrency/procthread.html). Please note that the thread-based runtime is available *only* for [Java](../api#java) functions.
+
+You can select the runtime when you start up a Pulsar {% popover broker %} via the broker's [configuration](#runtime-config).
+
+{% include admonition.html type="success" title="Other runtimes" content="The process-based and thread-based runtimes for Pulsar Functions" %}
+
+#### Process-based runtime {#process}
+
+The process-based runtime for Pulsar Functions runs function [instances](#parallelism) in separate processes. For instructions on using the process-based runtime, see [below](#using-process).
+
+{% include admonition.html type="info" content="The process-based runtime is the **default** for Pulsar Functions." %}
+
+#### Thread-based runtime {#thread}
+
+The thread-based runtime for Pulsar Functions runs function [instances](#parallelism) in separate [JVM](https://en.wikipedia.org/wiki/Java_virtual_machine) threads. For instructions on using the thread-based runtime, see [below](#using-thread).
+
+{% include admonition.html type="warning" title="Java only" content="The thread-based runtime can only be used with Pulsar Functions written in [Java](../api#java). If you choose the thread-based runtime, you won't be able to run non-Java functions." %}
+
+#### Docker runtime (coming soon) {#docker}
+
+A future release of Pulsar will feature a [Docker](https://docker.com)-based runtime that runs Pulsar Function instances in Docker containers, which facilitates using container orchestration platforms like [Kubernetes](https://kubernetes.io).
+### Configuration {#runtime-config}
+
+The following configurable parameters are available in the [`functions_worker.yml`](../../reference/Configuration#worker) configuration file for Pulsar {% popover brokers %}:
+
+{% include config.html id="functions_worker" %}
+
+#### Using the process-based runtime {#using-process}
+
+The process-based runtime for Pulsar Functions is the **default**. In the [`functions_worker.yml`](#runtime-config) configuration file, you'll see this parameter present:
+
+```yaml
+processContainerFactory:
+  logDirectory:
+```
+
+Leave the `processContainerFactory` parameter in place if you'd like to use the process-based runtime. You can also specify a logging directory using the `logDirectory` sub-parameter. Here's an example configuration for the process-based runtime:
+
+```yaml
+processContainerFactory:
+  logDirectory: /path/to/logging/dir
+```
+
+#### Using the thread-based runtime {#using-thread}
+
+To use the thread-based runtime for Pulsar Functions, remove the `processContainerFactory` parameter present by default in the `functions_worker.yml` [config file](#runtime-config) and replace it with a `threadContainerFactory` parameter that has a `threadGroupName` sub-parameter. Here's an example:
+
+```yaml
+threadContainerFactory:
+  threadGroupName: "Thread Function Container Group"
+```
\ No newline at end of file
diff --git a/site/docs/latest/reference/Configuration.md b/site/docs/latest/reference/Configuration.md
index 7fcc58863a..d67efc8f79 100644
--- a/site/docs/latest/reference/Configuration.md
+++ b/site/docs/latest/reference/Configuration.md
@@ -30,6 +30,7 @@ Pulsar configuration can be managed either via a series of configuration files c
 * [Client](#client)
 * [Service discovery](#service-discovery)
 * [Configuration store](#configuration-store)
+* [Pulsar Functions worker](#worker)
 * [Log4j](#log4j)
 * [Log4j shell](#log4j-shell)
 * [Standalone](#standalone)
@@ -62,6 +63,12 @@ The [`pulsar-client`](../CliTools#pulsar-client) CLI tool can be used to publish
 
 {% include config.html id="configuration-store" %}
 
+## Pulsar Functions worker {#worker}
+
+Configuration for the [worker](../../functions/deployment#worker) process that drives [Pulsar Functions](../../functions/overview).
+
+{% include config.html id="functions_worker" %}
+
 ## Log4j
 
 {% include config.html id="log4j" %}
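
Note for readers of this archive: the parameters documented in the diff above can be read together as one worker configuration. The fragment below is only an illustrative sketch and is not part of the pull request. It assumes that every documented parameter is a top-level key in `functions_worker.yml`, restates the defaults listed in the diff, and selects the thread-based runtime in place of the default `processContainerFactory`.

```yaml
# Hypothetical functions_worker.yml assembled from the defaults documented in the diff.
# Assumption: each parameter is a top-level key; the PR itself does not show a full file.

# Worker identity and address
workerId: standalone
workerHostname: localhost
workerPort: 6750

# Pulsar topics and namespace used by the worker
functionMetadataTopicName: metadata
clusterCoordinationTopicName: coordinate
functionAssignmentTopicName: assignments
pulsarFunctionsNamespace: public/functions
pulsarFunctionsCluster: standalone

# Broker endpoints associated with the worker
pulsarServiceUrl: pulsar://localhost:6650
pulsarWebServiceUrl: http://localhost:8080

# Function package storage
numFunctionPackageReplicas: 1
downloadDirectory: /tmp/pulsar_functions

# Runtime selection: thread-based (Java functions only).
# To keep the default process-based runtime, use processContainerFactory instead.
threadContainerFactory:
  threadGroupName: "Thread Function Container Group"

# Scheduling, retries, and health checks
schedulerClassName: org.apache.pulsar.functions.worker.scheduler.RoundRobinScheduler
failureCheckFreqMs: 30000
rescheduleTimeoutMs: 60000
initialBrokerReconnectMaxRetries: 60
assignmentWriteMaxRetries: 60
instanceLivenessCheckFreqMs: 30000
```

Only one of `processContainerFactory` and `threadContainerFactory` should be present, since the documentation in the diff describes replacing one with the other.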


 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
