From: tgraves@apache.org
To: commits@spark.apache.org
Date: Fri, 06 Jul 2018 14:53:38 -0000
Subject: [45/51] [partial] spark-website git commit: Spark 2.2.2 docs

http://git-wip-us.apache.org/repos/asf/spark-website/blob/e1001463/site/docs/2.2.2/api/R/crossJoin.html
----------------------------------------------------------------------
R: CrossJoin
crossJoin {SparkR}R Documentation
+ +

CrossJoin

+ +

Description

+ +

Returns the Cartesian product of two SparkDataFrames.

+ + +

Usage

+ +
+## S4 method for signature 'SparkDataFrame,SparkDataFrame'
+crossJoin(x, y)
+
+ + +

Arguments

+ + + + + + +
x +

A SparkDataFrame

+
y +

A SparkDataFrame

+
+ + +

Value

+ +

A SparkDataFrame containing the result of the join operation. +

+ + +

Note

+ +

crossJoin since 2.1.0 +

+ + +

See Also

+ +

merge join +

+

Other SparkDataFrame functions: SparkDataFrame-class, +agg, arrange, +as.data.frame, +attach,SparkDataFrame-method, +cache, checkpoint, +coalesce, collect, +colnames, coltypes, +createOrReplaceTempView, +dapplyCollect, dapply, +describe, dim, +distinct, dropDuplicates, +dropna, drop, +dtypes, except, +explain, filter, +first, gapplyCollect, +gapply, getNumPartitions, +group_by, head, +hint, histogram, +insertInto, intersect, +isLocal, isStreaming, +join, limit, +merge, mutate, +ncol, nrow, +persist, printSchema, +randomSplit, rbind, +registerTempTable, rename, +repartition, sample, +saveAsTable, schema, +selectExpr, select, +showDF, show, +storageLevel, str, +subset, take, +toJSON, union, +unpersist, withColumn, +with, write.df, +write.jdbc, write.json, +write.orc, write.parquet, +write.stream, write.text +

+ + +

Examples

+ +
## Not run: 
+##D sparkR.session()
+##D df1 <- read.json(path)
+##D df2 <- read.json(path2)
+##D crossJoin(df1, df2) # Performs a Cartesian (cross) join
+## End(Not run)
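As a rough illustration of the Cartesian semantics described above, the sketch below (hypothetical column names, assuming a running SparkR session) checks that the joined frame has nrow(df1) * nrow(df2) rows:

  library(SparkR)
  sparkR.session()

  df1 <- createDataFrame(data.frame(id = 1:3, a = c("x", "y", "z"), stringsAsFactors = FALSE))
  df2 <- createDataFrame(data.frame(k = 1:2, b = c("p", "q"), stringsAsFactors = FALSE))

  joined <- crossJoin(df1, df2)
  nrow(joined) == nrow(df1) * nrow(df2)  # TRUE: every row of df1 is paired with every row of df2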
+
+ + +
[Package SparkR version 2.2.2 Index]
http://git-wip-us.apache.org/repos/asf/spark-website/blob/e1001463/site/docs/2.2.2/api/R/crosstab.html
----------------------------------------------------------------------
R: Computes a pair-wise frequency table of the given columns
crosstab {SparkR}R Documentation
+ +

Computes a pair-wise frequency table of the given columns

+ +

Description

+ +

Computes a pair-wise frequency table of the given columns. Also known as a contingency +table. The number of distinct values for each column should be less than 1e4. At most 1e6 +non-zero pair frequencies will be returned. +

+ + +

Usage

+ +
+## S4 method for signature 'SparkDataFrame,character,character'
+crosstab(x, col1, col2)
+
+ + +

Arguments

+ + + + + + + + +
x +

a SparkDataFrame

+
col1 +

name of the first column. Distinct items will make the first item of each row.

+
col2 +

name of the second column. Distinct items will make the column names of the output.

+
+ + +

Value

+ +

a local R data.frame representing the contingency table. The first column of each row +will be the distinct values of col1 and the column names will be the distinct values +of col2. The name of the first column will be "col1_col2". Pairs +that have no occurrences will have zero as their counts. +

+ + +

Note

+ +

crosstab since 1.5.0 +

+ + +

See Also

+ +

Other stat functions: approxQuantile, +corr, cov, +freqItems, sampleBy +

+ + +

Examples

+ +
## Not run: 
+##D df <- read.json("/path/to/file.json")
+##D ct <- crosstab(df, "title", "gender")
+## End(Not run)
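A small sketch of the output layout described under Value (hypothetical data, assuming a running SparkR session); the result is a plain local R data.frame whose first column holds the distinct values of col1:

  library(SparkR)
  sparkR.session()

  df <- createDataFrame(data.frame(title = c("Dr", "Dr", "Mr", "Ms"),
                                   gender = c("F", "M", "M", "F"),
                                   stringsAsFactors = FALSE))
  ct <- crosstab(df, "title", "gender")
  # 'ct' is a local data.frame; its first column is named "title_gender",
  # the remaining columns are the distinct values of 'gender', and pairs
  # that never occur are reported with zero counts.
  ct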
+
+ + +
[Package SparkR version 2.2.2 Index]
http://git-wip-us.apache.org/repos/asf/spark-website/blob/e1001463/site/docs/2.2.2/api/R/cume_dist.html
----------------------------------------------------------------------
R: cume_dist
cume_dist {SparkR}R Documentation
+ +

cume_dist

+ +

Description

+ +

Window function: returns the cumulative distribution of values within a window partition, +i.e. the fraction of rows that are below the current row. +

+ + +

Usage

+ +
+cume_dist(x = "missing")
+
+## S4 method for signature 'missing'
+cume_dist()
+
+ + +

Arguments

+ + + + +
x +

empty. Should be used with no argument.

+
+ + +

Details

+ +

N = total number of rows in the partition
cume_dist(x) = number of values before (and including) x / N

+

This is equivalent to the CUME_DIST function in SQL. +

+ + +

Note

+ +

cume_dist since 1.6.0 +

+ + +

See Also

+ +

Other window_funcs: dense_rank, +lag, lead, +ntile, percent_rank, +rank, row_number +

+ + +

Examples

+ +
## Not run: 
+##D   df <- createDataFrame(mtcars)
+##D   ws <- orderBy(windowPartitionBy("am"), "hp")
+##D   out <- select(df, over(cume_dist(), ws), df$hp, df$am)
+## End(Not run)
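To make the formula in Details concrete, here is a hedged sketch (hypothetical values, assuming a running SparkR session): for the ordered values 1, 2, 2, 3 within one partition, cume_dist yields 0.25, 0.75, 0.75 and 1.0.

  library(SparkR)
  sparkR.session()

  df <- createDataFrame(data.frame(am = c(0, 0, 0, 0), hp = c(1, 2, 2, 3)))
  ws <- orderBy(windowPartitionBy("am"), "hp")
  # With N = 4 rows, each value gets (rows <= current value) / N:
  # hp = 1 -> 0.25, hp = 2 -> 0.75 (two tied rows), hp = 3 -> 1.0
  showDF(select(df, df$hp, over(cume_dist(), ws)))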
+
+ + +
[Package SparkR version 2.2.2 Index]
http://git-wip-us.apache.org/repos/asf/spark-website/blob/e1001463/site/docs/2.2.2/api/R/currentDatabase.html
----------------------------------------------------------------------
R: Returns the current default database
currentDatabase {SparkR}R Documentation
+ +

Returns the current default database

+ +

Description

+ +

Returns the current default database. +

+ + +

Usage

+ +
+currentDatabase()
+
+ + +

Value

+ +

name of the current default database. +

+ + +

Note

+ +

since 2.2.0 +

+ + +

Examples

+ +
## Not run: 
+##D sparkR.session()
+##D currentDatabase()
+## End(Not run)
+
+ + +
[Package SparkR version 2.2.2 Index]
http://git-wip-us.apache.org/repos/asf/spark-website/blob/e1001463/site/docs/2.2.2/api/R/dapply.html
----------------------------------------------------------------------
R: dapply
dapply {SparkR}R Documentation
+ +

dapply

+ +

Description

+ +

Apply a function to each partition of a SparkDataFrame. +

+ + +

Usage

+ +
+dapply(x, func, schema)
+
+## S4 method for signature 'SparkDataFrame,`function`,structType'
+dapply(x, func, schema)
+
+ + +

Arguments

+ + + + + + + + +
x +

A SparkDataFrame

+
func +

A function to be applied to each partition of the SparkDataFrame. func should have only one parameter, to which an R data.frame corresponding to each partition will be passed. The output of func should be an R data.frame.

+
schema +

The schema of the resulting SparkDataFrame after the function is applied. +It must match the output of func.

+
+ + +

Note

+ +

dapply since 2.0.0 +

+ + +

See Also

+ +

dapplyCollect +

+

Other SparkDataFrame functions: SparkDataFrame-class, +agg, arrange, +as.data.frame, +attach,SparkDataFrame-method, +cache, checkpoint, +coalesce, collect, +colnames, coltypes, +createOrReplaceTempView, +crossJoin, dapplyCollect, +describe, dim, +distinct, dropDuplicates, +dropna, drop, +dtypes, except, +explain, filter, +first, gapplyCollect, +gapply, getNumPartitions, +group_by, head, +hint, histogram, +insertInto, intersect, +isLocal, isStreaming, +join, limit, +merge, mutate, +ncol, nrow, +persist, printSchema, +randomSplit, rbind, +registerTempTable, rename, +repartition, sample, +saveAsTable, schema, +selectExpr, select, +showDF, show, +storageLevel, str, +subset, take, +toJSON, union, +unpersist, withColumn, +with, write.df, +write.jdbc, write.json, +write.orc, write.parquet, +write.stream, write.text +

+ + +

Examples

+ +
## Not run: 
+##D   df <- createDataFrame(iris)
+##D   df1 <- dapply(df, function(x) { x }, schema(df))
+##D   collect(df1)
+##D 
+##D   # filter and add a column
+##D   df <- createDataFrame(
+##D           list(list(1L, 1, "1"), list(2L, 2, "2"), list(3L, 3, "3")),
+##D           c("a", "b", "c"))
+##D   schema <- structType(structField("a", "integer"), structField("b", "double"),
+##D                      structField("c", "string"), structField("d", "integer"))
+##D   df1 <- dapply(
+##D            df,
+##D            function(x) {
+##D              y <- x[x[1] > 1, ]
+##D              y <- cbind(y, y[1] + 1L)
+##D            },
+##D            schema)
+##D   collect(df1)
+##D   # the result
+##D   #       a b c d
+##D   #     1 2 2 2 3
+##D   #     2 3 3 3 4
+## End(Not run)
+
+ + +
[Package SparkR version 2.2.2 Index]
http://git-wip-us.apache.org/repos/asf/spark-website/blob/e1001463/site/docs/2.2.2/api/R/dapplyCollect.html
----------------------------------------------------------------------
R: dapplyCollect
dapplyCollect {SparkR}R Documentation
+ +

dapplyCollect

+ +

Description

+ +

Apply a function to each partition of a SparkDataFrame and collect the result back +to R as a data.frame. +

+ + +

Usage

+ +
+dapplyCollect(x, func)
+
+## S4 method for signature 'SparkDataFrame,`function`'
+dapplyCollect(x, func)
+
+ + +

Arguments

+ + + + + + +
x +

A SparkDataFrame

+
func +

A function to be applied to each partition of the SparkDataFrame. func should have only one parameter, to which an R data.frame corresponding to each partition will be passed. The output of func should be an R data.frame.

+
+ + +

Note

+ +

dapplyCollect since 2.0.0 +

+ + +

See Also

+ +

dapply +

+

Other SparkDataFrame functions: SparkDataFrame-class, +agg, arrange, +as.data.frame, +attach,SparkDataFrame-method, +cache, checkpoint, +coalesce, collect, +colnames, coltypes, +createOrReplaceTempView, +crossJoin, dapply, +describe, dim, +distinct, dropDuplicates, +dropna, drop, +dtypes, except, +explain, filter, +first, gapplyCollect, +gapply, getNumPartitions, +group_by, head, +hint, histogram, +insertInto, intersect, +isLocal, isStreaming, +join, limit, +merge, mutate, +ncol, nrow, +persist, printSchema, +randomSplit, rbind, +registerTempTable, rename, +repartition, sample, +saveAsTable, schema, +selectExpr, select, +showDF, show, +storageLevel, str, +subset, take, +toJSON, union, +unpersist, withColumn, +with, write.df, +write.jdbc, write.json, +write.orc, write.parquet, +write.stream, write.text +

+ + +

Examples

+ +
## Not run: 
+##D   df <- createDataFrame(iris)
+##D   ldf <- dapplyCollect(df, function(x) { x })
+##D 
+##D   # filter and add a column
+##D   df <- createDataFrame(
+##D           list(list(1L, 1, "1"), list(2L, 2, "2"), list(3L, 3, "3")),
+##D           c("a", "b", "c"))
+##D   ldf <- dapplyCollect(
+##D            df,
+##D            function(x) {
+##D              y <- x[x[1] > 1, ]
+##D              y <- cbind(y, y[1] + 1L)
+##D            })
+##D   # the result
+##D   #       a b c d
+##D   #       2 2 2 3
+##D   #       3 3 3 4
+## End(Not run)
+
+ + +
[Package SparkR version 2.2.2 Index]
http://git-wip-us.apache.org/repos/asf/spark-website/blob/e1001463/site/docs/2.2.2/api/R/date_add.html
----------------------------------------------------------------------
R: date_add
date_add {SparkR}R Documentation
+ +

date_add

+ +

Description

+ +

Returns the date that is x days after y.

+ + +

Usage

+ +
+date_add(y, x)
+
+## S4 method for signature 'Column,numeric'
+date_add(y, x)
+
+ + +

Arguments

+ + + + + + +
y +

Column to compute on

+
x +

Number of days to add

+
+ + +

Note

+ +

date_add since 1.5.0 +

+ + +

See Also

+ +

Other datetime_funcs: add_months, +date_format, date_sub, +datediff, dayofmonth, +dayofyear, from_unixtime, +from_utc_timestamp, hour, +last_day, minute, +months_between, month, +next_day, quarter, +second, to_date, +to_timestamp, +to_utc_timestamp, +unix_timestamp, weekofyear, +window, year +

+ + +

Examples

+ +
## Not run: date_add(df$d, 1)
+
+ + +
[Package SparkR version 2.2.2 Index]
http://git-wip-us.apache.org/repos/asf/spark-website/blob/e1001463/site/docs/2.2.2/api/R/date_format.html
----------------------------------------------------------------------
R: date_format
date_format {SparkR}R Documentation
+ +

date_format

+ +

Description

+ +

Converts a date/timestamp/string to a value of string in the format specified by the date +format given by the second argument. +

+ + +

Usage

+ +
+date_format(y, x)
+
+## S4 method for signature 'Column,character'
+date_format(y, x)
+
+ + +

Arguments

+ + + + + + +
y +

Column to compute on.

+
x +

date format specification.

+
+ + +

Details

+ +

A pattern could be for instance

+
dd.MM.yyyy

and could return a string like '18.03.1993'. All +pattern letters of java.text.SimpleDateFormat can be used. +

+

Note: Whenever possible, use specialized functions like year; these benefit from a specialized implementation.

+ + +

Note

+ +

date_format since 1.5.0 +

+ + +

See Also

+ +

Other datetime_funcs: add_months, +date_add, date_sub, +datediff, dayofmonth, +dayofyear, from_unixtime, +from_utc_timestamp, hour, +last_day, minute, +months_between, month, +next_day, quarter, +second, to_date, +to_timestamp, +to_utc_timestamp, +unix_timestamp, weekofyear, +window, year +

+ + +

Examples

+ +
## Not run: date_format(df$t, 'MM/dd/yyyy')
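As a slightly fuller sketch of the pattern behaviour described in Details (hypothetical column names and data, assuming a running SparkR session):

  library(SparkR)
  sparkR.session()

  df <- createDataFrame(data.frame(d = "2018-07-06", stringsAsFactors = FALSE))
  df <- withColumn(df, "ts", to_date(df$d))
  # SimpleDateFormat patterns: "dd.MM.yyyy" -> "06.07.2018", "yyyy" -> "2018"
  showDF(select(df, date_format(df$ts, "dd.MM.yyyy"), date_format(df$ts, "yyyy")))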
+
+ + +
[Package SparkR version 2.2.2 Index]
http://git-wip-us.apache.org/repos/asf/spark-website/blob/e1001463/site/docs/2.2.2/api/R/date_sub.html
----------------------------------------------------------------------
R: date_sub
date_sub {SparkR}R Documentation
+ +

date_sub

+ +

Description

+ +

Returns the date that is x days before y.

+ + +

Usage

+ +
+date_sub(y, x)
+
+## S4 method for signature 'Column,numeric'
+date_sub(y, x)
+
+ + +

Arguments

+ + + + + + +
y +

Column to compute on

+
x +

Number of days to subtract

+
+ + +

Note

+ +

date_sub since 1.5.0 +

+ + +

See Also

+ +

Other datetime_funcs: add_months, +date_add, date_format, +datediff, dayofmonth, +dayofyear, from_unixtime, +from_utc_timestamp, hour, +last_day, minute, +months_between, month, +next_day, quarter, +second, to_date, +to_timestamp, +to_utc_timestamp, +unix_timestamp, weekofyear, +window, year +

+ + +

Examples

+ +
## Not run: date_sub(df$d, 1)
+
+ + +
[Package SparkR version 2.2.2 Index]
http://git-wip-us.apache.org/repos/asf/spark-website/blob/e1001463/site/docs/2.2.2/api/R/datediff.html
----------------------------------------------------------------------
R: datediff
datediff {SparkR}R Documentation
+ +

datediff

+ +

Description

+ +

Returns the number of days from start to end. +

+ + +

Usage

+ +
+datediff(y, x)
+
+## S4 method for signature 'Column'
+datediff(y, x)
+
+ + +

Arguments

+ + + + + + +
y +

end Column to use.

+
x +

start Column to use.

+
+ + +

Note

+ +

datediff since 1.5.0 +

+ + +

See Also

+ +

Other datetime_funcs: add_months, +date_add, date_format, +date_sub, dayofmonth, +dayofyear, from_unixtime, +from_utc_timestamp, hour, +last_day, minute, +months_between, month, +next_day, quarter, +second, to_date, +to_timestamp, +to_utc_timestamp, +unix_timestamp, weekofyear, +window, year +

+ + +

Examples

+ +
## Not run: datediff(df$c, x)
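A short hedged sketch of the argument order (y is the end date, x the start date; hypothetical data, assuming a running SparkR session):

  library(SparkR)
  sparkR.session()

  df <- createDataFrame(data.frame(start = "2018-07-01", end = "2018-07-06",
                                   stringsAsFactors = FALSE))
  # datediff(end, start): positive when 'end' is later than 'start'
  showDF(select(df, datediff(to_date(df$end), to_date(df$start))))  # 5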
+
+ + +
[Package SparkR version 2.2.2 Index]
http://git-wip-us.apache.org/repos/asf/spark-website/blob/e1001463/site/docs/2.2.2/api/R/dayofmonth.html
----------------------------------------------------------------------
R: dayofmonth
dayofmonth {SparkR}R Documentation
+ +

dayofmonth

+ +

Description

+ +

Extracts the day of the month as an integer from a given date/timestamp/string. +

+ + +

Usage

+ +
+dayofmonth(x)
+
+## S4 method for signature 'Column'
+dayofmonth(x)
+
+ + +

Arguments

+ + + + +
x +

Column to compute on.

+
+ + +

Note

+ +

dayofmonth since 1.5.0 +

+ + +

See Also

+ +

Other datetime_funcs: add_months, +date_add, date_format, +date_sub, datediff, +dayofyear, from_unixtime, +from_utc_timestamp, hour, +last_day, minute, +months_between, month, +next_day, quarter, +second, to_date, +to_timestamp, +to_utc_timestamp, +unix_timestamp, weekofyear, +window, year +

+ + +

Examples

+ +
## Not run: dayofmonth(df$c)
+
+ + +
[Package SparkR version 2.2.2 Index]
http://git-wip-us.apache.org/repos/asf/spark-website/blob/e1001463/site/docs/2.2.2/api/R/dayofyear.html
----------------------------------------------------------------------
R: dayofyear
dayofyear {SparkR}R Documentation
+ +

dayofyear

+ +

Description

+ +

Extracts the day of the year as an integer from a given date/timestamp/string. +

+ + +

Usage

+ +
+dayofyear(x)
+
+## S4 method for signature 'Column'
+dayofyear(x)
+
+ + +

Arguments

+ + + + +
x +

Column to compute on.

+
+ + +

Note

+ +

dayofyear since 1.5.0 +

+ + +

See Also

+ +

Other datetime_funcs: add_months, +date_add, date_format, +date_sub, datediff, +dayofmonth, from_unixtime, +from_utc_timestamp, hour, +last_day, minute, +months_between, month, +next_day, quarter, +second, to_date, +to_timestamp, +to_utc_timestamp, +unix_timestamp, weekofyear, +window, year +

+ + +

Examples

+ +
## Not run: dayofyear(df$c)
+
+ + +
[Package SparkR version 2.2.2 Index]
http://git-wip-us.apache.org/repos/asf/spark-website/blob/e1001463/site/docs/2.2.2/api/R/decode.html
----------------------------------------------------------------------
R: decode
decode {SparkR}R Documentation
+ +

decode

+ +

Description

+ +

Decodes the first argument from a binary column into a string using the provided character set (one of 'US-ASCII', 'ISO-8859-1', 'UTF-8', 'UTF-16BE', 'UTF-16LE', 'UTF-16').

+ + +

Usage

+ +
+decode(x, charset)
+
+## S4 method for signature 'Column,character'
+decode(x, charset)
+
+ + +

Arguments

+ + + + + + +
x +

Column to compute on.

+
charset +

Character set to use

+
+ + +

Note

+ +

decode since 1.6.0 +

+ + +

See Also

+ +

Other string_funcs: ascii, +base64, concat_ws, +concat, encode, +format_number, format_string, +initcap, instr, +length, levenshtein, +locate, lower, +lpad, ltrim, +regexp_extract, +regexp_replace, reverse, +rpad, rtrim, +soundex, substring_index, +translate, trim, +unbase64, upper +

+ + +

Examples

+ +
## Not run: decode(df$c, "UTF-8")
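A hedged round-trip sketch using encode to first produce a binary column (hypothetical data, assuming a running SparkR session):

  library(SparkR)
  sparkR.session()

  df <- createDataFrame(data.frame(s = "spark", stringsAsFactors = FALSE))
  df <- withColumn(df, "bin", encode(df$s, "UTF-8"))   # string -> binary
  showDF(select(df, decode(df$bin, "UTF-8")))          # binary -> string, back to "spark"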
+
+ + +
[Package SparkR version 2.2.2 Index]
http://git-wip-us.apache.org/repos/asf/spark-website/blob/e1001463/site/docs/2.2.2/api/R/dense_rank.html
----------------------------------------------------------------------
R: dense_rank
dense_rank {SparkR}R Documentation
+ +

dense_rank

+ +

Description

+ +

Window function: returns the rank of rows within a window partition, without any gaps. The difference between rank and dense_rank is that dense_rank leaves no gaps in the ranking sequence when there are ties. That is, if you were ranking a competition using dense_rank and had three people tie for second place, all three would be in second place and the next person would come in third. rank, by contrast, assigns sequential numbers, so the person who came after the ties would register as coming in fifth.

+ + +

Usage

+ +
+dense_rank(x = "missing")
+
+## S4 method for signature 'missing'
+dense_rank()
+
+ + +

Arguments

+ + + + +
x +

empty. Should be used with no argument.

+
+ + +

Details

+ +

This is equivalent to the DENSE_RANK function in SQL. +

+ + +

Note

+ +

dense_rank since 1.6.0 +

+ + +

See Also

+ +

Other window_funcs: cume_dist, +lag, lead, +ntile, percent_rank, +rank, row_number +

+ + +

Examples

+ +
## Not run: 
+##D   df <- createDataFrame(mtcars)
+##D   ws <- orderBy(windowPartitionBy("am"), "hp")
+##D   out <- select(df, over(dense_rank(), ws), df$hp, df$am)
+## End(Not run)
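To make the rank/dense_rank contrast in the Description concrete, a hedged sketch (hypothetical values, assuming a running SparkR session): for the ordered values 1, 2, 2, 3, rank gives 1, 2, 2, 4 while dense_rank gives 1, 2, 2, 3.

  library(SparkR)
  sparkR.session()

  df <- createDataFrame(data.frame(am = c(0, 0, 0, 0), hp = c(1, 2, 2, 3)))
  ws <- orderBy(windowPartitionBy("am"), "hp")
  # rank:       1, 2, 2, 4  (the tie leaves a gap)
  # dense_rank: 1, 2, 2, 3  (no gap after the tie)
  showDF(select(df, df$hp, over(rank(), ws), over(dense_rank(), ws)))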
+
+ + +
[Package SparkR version 2.2.2 Index]
http://git-wip-us.apache.org/repos/asf/spark-website/blob/e1001463/site/docs/2.2.2/api/R/dim.html
----------------------------------------------------------------------
R: Returns the dimensions of SparkDataFrame
dim {SparkR}R Documentation
+ +

Returns the dimensions of SparkDataFrame

+ +

Description

+ +

Returns the dimensions (number of rows and columns) of a SparkDataFrame +

+ + +

Usage

+ +
+## S4 method for signature 'SparkDataFrame'
+dim(x)
+
+ + +

Arguments

+ + + + +
x +

a SparkDataFrame

+
+ + +

Note

+ +

dim since 1.5.0 +

+ + +

See Also

+ +

Other SparkDataFrame functions: SparkDataFrame-class, +agg, arrange, +as.data.frame, +attach,SparkDataFrame-method, +cache, checkpoint, +coalesce, collect, +colnames, coltypes, +createOrReplaceTempView, +crossJoin, dapplyCollect, +dapply, describe, +distinct, dropDuplicates, +dropna, drop, +dtypes, except, +explain, filter, +first, gapplyCollect, +gapply, getNumPartitions, +group_by, head, +hint, histogram, +insertInto, intersect, +isLocal, isStreaming, +join, limit, +merge, mutate, +ncol, nrow, +persist, printSchema, +randomSplit, rbind, +registerTempTable, rename, +repartition, sample, +saveAsTable, schema, +selectExpr, select, +showDF, show, +storageLevel, str, +subset, take, +toJSON, union, +unpersist, withColumn, +with, write.df, +write.jdbc, write.json, +write.orc, write.parquet, +write.stream, write.text +

+ + +

Examples

+ +
## Not run: 
+##D sparkR.session()
+##D path <- "path/to/file.json"
+##D df <- read.json(path)
+##D dim(df)
+## End(Not run)
+
+ + +
[Package SparkR version 2.2.2 Index]
http://git-wip-us.apache.org/repos/asf/spark-website/blob/e1001463/site/docs/2.2.2/api/R/distinct.html
----------------------------------------------------------------------
R: Distinct
distinct {SparkR}R Documentation
+ +

Distinct

+ +

Description

+ +

Return a new SparkDataFrame containing the distinct rows in this SparkDataFrame. +

+ + +

Usage

+ +
+distinct(x)
+
+## S4 method for signature 'SparkDataFrame'
+distinct(x)
+
+## S4 method for signature 'SparkDataFrame'
+unique(x)
+
+ + +

Arguments

+ + + + +
x +

A SparkDataFrame

+
+ + +

Note

+ +

distinct since 1.4.0 +

+

unique since 1.5.0 +

+ + +

See Also

+ +

Other SparkDataFrame functions: SparkDataFrame-class, +agg, arrange, +as.data.frame, +attach,SparkDataFrame-method, +cache, checkpoint, +coalesce, collect, +colnames, coltypes, +createOrReplaceTempView, +crossJoin, dapplyCollect, +dapply, describe, +dim, dropDuplicates, +dropna, drop, +dtypes, except, +explain, filter, +first, gapplyCollect, +gapply, getNumPartitions, +group_by, head, +hint, histogram, +insertInto, intersect, +isLocal, isStreaming, +join, limit, +merge, mutate, +ncol, nrow, +persist, printSchema, +randomSplit, rbind, +registerTempTable, rename, +repartition, sample, +saveAsTable, schema, +selectExpr, select, +showDF, show, +storageLevel, str, +subset, take, +toJSON, union, +unpersist, withColumn, +with, write.df, +write.jdbc, write.json, +write.orc, write.parquet, +write.stream, write.text +

+ + +

Examples

+ +
## Not run: 
+##D sparkR.session()
+##D path <- "path/to/file.json"
+##D df <- read.json(path)
+##D distinctDF <- distinct(df)
+## End(Not run)
+
+ + +
[Package SparkR version 2.2.2 Index]
http://git-wip-us.apache.org/repos/asf/spark-website/blob/e1001463/site/docs/2.2.2/api/R/drop.html
----------------------------------------------------------------------
R: drop
drop {SparkR}R Documentation
+ +

drop

+ +

Description

+ +

Returns a new SparkDataFrame with the specified columns dropped. This is a no-op if the schema doesn't contain the given column name(s).

+ + +

Usage

+ +
+drop(x, ...)
+
+## S4 method for signature 'SparkDataFrame'
+drop(x, col)
+
+## S4 method for signature 'ANY'
+drop(x)
+
+ + +

Arguments

+ + + + + + + + +
x +

a SparkDataFrame.

+
... +

further arguments to be passed to or from other methods.

+
col +

a character vector of column names or a Column.

+
+ + +

Value

+ +

A SparkDataFrame. +

+ + +

Note

+ +

drop since 2.0.0 +

+ + +

See Also

+ +

Other SparkDataFrame functions: SparkDataFrame-class, +agg, arrange, +as.data.frame, +attach,SparkDataFrame-method, +cache, checkpoint, +coalesce, collect, +colnames, coltypes, +createOrReplaceTempView, +crossJoin, dapplyCollect, +dapply, describe, +dim, distinct, +dropDuplicates, dropna, +dtypes, except, +explain, filter, +first, gapplyCollect, +gapply, getNumPartitions, +group_by, head, +hint, histogram, +insertInto, intersect, +isLocal, isStreaming, +join, limit, +merge, mutate, +ncol, nrow, +persist, printSchema, +randomSplit, rbind, +registerTempTable, rename, +repartition, sample, +saveAsTable, schema, +selectExpr, select, +showDF, show, +storageLevel, str, +subset, take, +toJSON, union, +unpersist, withColumn, +with, write.df, +write.jdbc, write.json, +write.orc, write.parquet, +write.stream, write.text +

+ + +

Examples

+ +
## Not run: 
+##D sparkR.session()
+##D path <- "path/to/file.json"
+##D df <- read.json(path)
+##D drop(df, "col1")
+##D drop(df, c("col1", "col2"))
+##D drop(df, df$col1)
+## End(Not run)
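A hedged sketch of the no-op behaviour noted in the Description (hypothetical column names, assuming a running SparkR session):

  library(SparkR)
  sparkR.session()

  df <- createDataFrame(data.frame(col1 = 1:3, col2 = 4:6))
  colnames(drop(df, "col1"))        # "col2"
  colnames(drop(df, "not_there"))   # unchanged: "col1" "col2" (no-op)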
+
+ + +
[Package SparkR version 2.2.2 Index]
http://git-wip-us.apache.org/repos/asf/spark-website/blob/e1001463/site/docs/2.2.2/api/R/dropDuplicates.html
----------------------------------------------------------------------
R: dropDuplicates
dropDuplicates {SparkR}R Documentation
+ +

dropDuplicates

+ +

Description

+ +

Returns a new SparkDataFrame with duplicate rows removed, considering only +the subset of columns. +

+ + +

Usage

+ +
+dropDuplicates(x, ...)
+
+## S4 method for signature 'SparkDataFrame'
+dropDuplicates(x, ...)
+
+ + +

Arguments

+ + + + + + +
x +

A SparkDataFrame.

+
... +

A character vector of column names, or column names passed as individual strings. If the first argument is a character vector, the following arguments are ignored.

+
+ + +

Value

+ +

A SparkDataFrame with duplicate rows removed. +

+ + +

Note

+ +

dropDuplicates since 2.0.0 +

+ + +

See Also

+ +

Other SparkDataFrame functions: SparkDataFrame-class, +agg, arrange, +as.data.frame, +attach,SparkDataFrame-method, +cache, checkpoint, +coalesce, collect, +colnames, coltypes, +createOrReplaceTempView, +crossJoin, dapplyCollect, +dapply, describe, +dim, distinct, +dropna, drop, +dtypes, except, +explain, filter, +first, gapplyCollect, +gapply, getNumPartitions, +group_by, head, +hint, histogram, +insertInto, intersect, +isLocal, isStreaming, +join, limit, +merge, mutate, +ncol, nrow, +persist, printSchema, +randomSplit, rbind, +registerTempTable, rename, +repartition, sample, +saveAsTable, schema, +selectExpr, select, +showDF, show, +storageLevel, str, +subset, take, +toJSON, union, +unpersist, withColumn, +with, write.df, +write.jdbc, write.json, +write.orc, write.parquet, +write.stream, write.text +

+ + +

Examples

+ +
## Not run: 
+##D sparkR.session()
+##D path <- "path/to/file.json"
+##D df <- read.json(path)
+##D dropDuplicates(df)
+##D dropDuplicates(df, "col1", "col2")
+##D dropDuplicates(df, c("col1", "col2"))
+## End(Not run)
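A hedged sketch of the column-subset behaviour (hypothetical data, assuming a running SparkR session): when a subset is given, only those columns decide what counts as a duplicate.

  library(SparkR)
  sparkR.session()

  df <- createDataFrame(data.frame(col1 = c(1, 1, 2), col2 = c("a", "b", "b"),
                                   stringsAsFactors = FALSE))
  nrow(dropDuplicates(df))          # 3 - all full rows are distinct
  nrow(dropDuplicates(df, "col1"))  # 2 - the first two rows share col1 = 1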
+
+ + +
[Package SparkR version 2.2.2 Index]
http://git-wip-us.apache.org/repos/asf/spark-website/blob/e1001463/site/docs/2.2.2/api/R/dropTempTable-deprecated.html
----------------------------------------------------------------------
R: (Deprecated) Drop Temporary Table
dropTempTable {SparkR}R Documentation
+ +

(Deprecated) Drop Temporary Table

+ +

Description

+ +

Drops the temporary table with the given table name in the catalog. +If the table has been cached/persisted before, it's also unpersisted. +

+ + +

Usage

+ +
+## Default S3 method:
+dropTempTable(tableName)
+
+ + +

Arguments

+ + + + +
tableName +

The name of the SparkSQL table to be dropped.

+
+ + +

Note

+ +

dropTempTable since 1.4.0 +

+ + +

See Also

+ +

dropTempView +

+ + +

Examples

+ +
## Not run: 
+##D sparkR.session()
+##D df <- read.df(path, "parquet")
+##D createOrReplaceTempView(df, "table")
+##D dropTempTable("table")
+## End(Not run)
+
+ + +
[Package SparkR version 2.2.2 Index]
http://git-wip-us.apache.org/repos/asf/spark-website/blob/e1001463/site/docs/2.2.2/api/R/dropTempView.html
----------------------------------------------------------------------
R: Drops the temporary view with the given view name in the...
dropTempView {SparkR}R Documentation
+ +

Drops the temporary view with the given view name in the catalog.

+ +

Description

+ +

Drops the temporary view with the given view name in the catalog. +If the view has been cached before, then it will also be uncached. +

+ + +

Usage

+ +
+dropTempView(viewName)
+
+ + +

Arguments

+ + + + +
viewName +

the name of the temporary view to be dropped.

+
+ + +

Value

+ +

TRUE if the view is dropped successfully, FALSE otherwise. +

+ + +

Note

+ +

since 2.0.0 +

+ + +

Examples

+ +
## Not run: 
+##D sparkR.session()
+##D df <- read.df(path, "parquet")
+##D createOrReplaceTempView(df, "table")
+##D dropTempView("table")
+## End(Not run)
+
+ + +
[Package SparkR version 2.2.2 Index]
http://git-wip-us.apache.org/repos/asf/spark-website/blob/e1001463/site/docs/2.2.2/api/R/dtypes.html
----------------------------------------------------------------------
R: DataTypes
dtypes {SparkR}R Documentation
+ +

DataTypes

+ +

Description

+ +

Returns all column names and their data types as a list.

+ + +

Usage

+ +
+dtypes(x)
+
+## S4 method for signature 'SparkDataFrame'
+dtypes(x)
+
+ + +

Arguments

+ + + + +
x +

A SparkDataFrame

+
+ + +

Note

+ +

dtypes since 1.4.0 +

+ + +

See Also

+ +

Other SparkDataFrame functions: SparkDataFrame-class, +agg, arrange, +as.data.frame, +attach,SparkDataFrame-method, +cache, checkpoint, +coalesce, collect, +colnames, coltypes, +createOrReplaceTempView, +crossJoin, dapplyCollect, +dapply, describe, +dim, distinct, +dropDuplicates, dropna, +drop, except, +explain, filter, +first, gapplyCollect, +gapply, getNumPartitions, +group_by, head, +hint, histogram, +insertInto, intersect, +isLocal, isStreaming, +join, limit, +merge, mutate, +ncol, nrow, +persist, printSchema, +randomSplit, rbind, +registerTempTable, rename, +repartition, sample, +saveAsTable, schema, +selectExpr, select, +showDF, show, +storageLevel, str, +subset, take, +toJSON, union, +unpersist, withColumn, +with, write.df, +write.jdbc, write.json, +write.orc, write.parquet, +write.stream, write.text +

+ + +

Examples

+ +
## Not run: 
+##D sparkR.session()
+##D path <- "path/to/file.json"
+##D df <- read.json(path)
+##D dtypes(df)
+## End(Not run)
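A hedged sketch of the return shape (hypothetical data, assuming a running SparkR session): each list element is a character vector of length two, the column name followed by its type.

  library(SparkR)
  sparkR.session()

  df <- createDataFrame(data.frame(n = 1L, x = 1.5, s = "a", stringsAsFactors = FALSE))
  dtypes(df)
  # list(c("n", "int"), c("x", "double"), c("s", "string"))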
+
+ + +
[Package SparkR version 2.2.2 Index]