From: tdas
To: reviews@spark.apache.org
Subject: [GitHub] spark pull request #16468: [SPARK-19074][SS][DOCS] Updated Structured Stream...
Date: Thu, 5 Jan 2017 23:52:01 +0000 (UTC)

Github user tdas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16468#discussion_r94877350

--- Diff: docs/structured-streaming-programming-guide.md ---
@@ -954,49 +1014,93 @@ There are a few types of built-in output sinks.

- **File sink** - Stores the output to a directory.
+{% highlight scala %}
+writeStream
+    .format("parquet")        // can be "orc", "json", "csv", etc.
+    .option("path", "path/to/destination/dir")
+    .start()
+{% endhighlight %}
+
- **Foreach sink** - Runs arbitrary computation on the records in the output. See later in the section for more details.

+{% highlight scala %}
+writeStream
+    .foreach(...)
+    .start()
+{% endhighlight %}
+
- **Console sink (for debugging)** - Prints the output to the console/stdout every time there is a trigger. Both Append and Complete output modes are supported. This should be used for debugging purposes on low data volumes, as the entire output is collected and stored in the driver's memory after every trigger.

+{% highlight scala %}
+writeStream
+    .format("console")
+    .start()
+{% endhighlight %}
+
-- **Memory sink (for debugging)** - The output is stored in memory as an in-memory table. Both, Append and Complete output modes, are supported. This should be used for debugging purposes on low data volumes as the entire output is collected and stored in the driver's memory after every trigger.
+- **Memory sink (for debugging)** - The output is stored in memory as an in-memory table.
+Both Append and Complete output modes are supported. This should be used for debugging purposes
+on low data volumes, as the entire output is collected and stored in the driver's memory after

--- End diff --

I kind of agree it's repetitive, but I don't want people to miss this.
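As context for the **Foreach sink** mentioned in the diff: Spark hands each output record to a writer object with an open/process/close lifecycle. The sketch below models that lifecycle with a local stand-in trait (`SinkWriter` is a hypothetical name, not Spark's actual `org.apache.spark.sql.ForeachWriter` class) so it runs without a Spark cluster:

```scala
// Hypothetical stand-in for the foreach-sink writer lifecycle
// (open/process/close); NOT the real org.apache.spark.sql.ForeachWriter.
trait SinkWriter[T] {
  // Called once per partition/epoch; return false to skip this partition.
  def open(partitionId: Long, version: Long): Boolean
  // Called for every record in the partition.
  def process(value: T): Unit
  // Called when the partition is done (errorOrNull is null on success).
  def close(errorOrNull: Throwable): Unit
}

// Example writer that simply buffers records, e.g. for inspection in tests.
class CollectingWriter extends SinkWriter[String] {
  val buffer = scala.collection.mutable.ArrayBuffer.empty[String]
  override def open(partitionId: Long, version: Long): Boolean = true
  override def process(value: String): Unit = buffer += value
  override def close(errorOrNull: Throwable): Unit = ()
}
```

With the real API, an instance of such a writer is what gets passed to `.foreach(...)` on the `writeStream` builder shown above.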