beam-commits mailing list archives

From j...@apache.org
Subject [1/2] beam git commit: Clarifies BigQueryIO javadoc
Date Wed, 05 Apr 2017 21:03:25 GMT
Repository: beam
Updated Branches:
  refs/heads/master 6edf2be2d -> 0c063c29c


Clarifies BigQueryIO javadoc

Documents support for per-value tables and clarifies that it doesn't
perform well in batch mode.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/5fe19ddd
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/5fe19ddd
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/5fe19ddd

Branch: refs/heads/master
Commit: 5fe19ddd1f67b228f19b5c77d59e285d921c8223
Parents: 6edf2be
Author: Reuven Lax <relax@google.com>
Authored: Mon Apr 3 19:23:09 2017 -0700
Committer: Eugene Kirpichov <kirpichov@google.com>
Committed: Wed Apr 5 13:49:46 2017 -0700

----------------------------------------------------------------------
 .../beam/sdk/io/gcp/bigquery/BigQueryIO.java    | 22 +++++++++++++-------
 1 file changed, 15 insertions(+), 7 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/beam/blob/5fe19ddd/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java
----------------------------------------------------------------------
diff --git a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java
index 3c7b549..f5f93b3 100644
--- a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java
+++ b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java
@@ -154,27 +154,35 @@ import org.slf4j.LoggerFactory;
  * <h3>Sharding BigQuery output tables</h3>
  *
  * <p>A common use case is to dynamically generate BigQuery table names based on
- * the current window. To support this,
+ * the current window or the current value. To support this,
  * {@link BigQueryIO.Write#to(SerializableFunction)}
- * accepts a function mapping the current window to a tablespec. For example,
+ * accepts a function mapping the current element to a tablespec. For example,
  * here's code that outputs daily tables to BigQuery:
  * <pre>{@code
  * PCollection<TableRow> quotes = ...
  * quotes.apply(Window.<TableRow>into(CalendarWindows.days(1)))
- *       .apply(BigQueryIO.Write
+ *       .apply(BigQueryIO.writeTableRows()
  *         .withSchema(schema)
- *         .to(new SerializableFunction<BoundedWindow, String>() {
- *           public String apply(BoundedWindow window) {
+ *         .to(new SerializableFunction<ValueInSingleWindow, String>() {
+ *           public String apply(ValueInSingleWindow value) {
  *             // The cast below is safe because CalendarWindows.days(1) produces IntervalWindows.
  *             String dayString = DateTimeFormat.forPattern("yyyy_MM_dd")
  *                  .withZone(DateTimeZone.UTC)
- *                  .print(((IntervalWindow) window).start());
+ *                  .print(((IntervalWindow) value.getWindow()).start());
  *             return "my-project:output.output_table_" + dayString;
  *           }
  *         }));
  * }</pre>
  *
- * <p>Per-window tables are not yet supported in batch mode.
+ * <p>Note that this also allows the table to be a function of the element as well as the current
+ * pane, in the case of triggered windows. In this case it might be convenient to call
+ * {@link BigQueryIO#write()} directly instead of using the {@link BigQueryIO#writeTableRows()}
+ * helper. This will allow the mapping function to access the element of the user-defined type.
+ * In this case, a formatting function must be specified using
+ * {@link BigQueryIO.Write#withFormatFunction} to convert each element into a {@link TableRow}
+ * object.
+ *
+ * <p>Per-value tables currently do not perform well in batch mode.
  *
  * <h3>Permissions</h3>
  *
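
The difference between per-window and per-value tables described in the javadoc above can be illustrated without a Beam dependency. The following is only a sketch under stated assumptions: ElementInWindow is a hypothetical stand-in for Beam's ValueInSingleWindow, java.time replaces the Joda-Time formatter used in the javadoc, and the per-value naming scheme (element lowercased into the table name) is invented for illustration. It is not the BigQueryIO implementation.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

class TableSpecSketch {
    // Hypothetical stand-in for ValueInSingleWindow: an element plus the
    // start of the (interval) window it belongs to.
    static final class ElementInWindow<T> {
        final T value;
        final Instant windowStart;

        ElementInWindow(T value, Instant windowStart) {
            this.value = value;
            this.windowStart = windowStart;
        }
    }

    // Same pattern as the javadoc example, formatted in UTC.
    private static final DateTimeFormatter DAY =
        DateTimeFormatter.ofPattern("yyyy_MM_dd").withZone(ZoneOffset.UTC);

    // Per-window table: the tablespec depends only on the window.
    static String dailyTable(ElementInWindow<?> input) {
        return "my-project:output.output_table_" + DAY.format(input.windowStart);
    }

    // Per-value table: the tablespec also depends on the element itself.
    // The naming scheme here is an assumption made for this sketch.
    static String perValueTable(ElementInWindow<String> input) {
        return "my-project:output."
            + input.value.toLowerCase(Locale.ROOT)
            + "_" + DAY.format(input.windowStart);
    }

    public static void main(String[] args) {
        ElementInWindow<String> quote =
            new ElementInWindow<>("GOOG", Instant.parse("2017-04-05T00:00:00Z"));
        System.out.println(dailyTable(quote));    // my-project:output.output_table_2017_04_05
        System.out.println(perValueTable(quote)); // my-project:output.goog_2017_04_05
    }
}
```

In an actual pipeline, the equivalent of perValueTable would be passed to BigQueryIO.Write#to(SerializableFunction), with withFormatFunction supplying the conversion to TableRow as the javadoc describes.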

