From: rxin@apache.org
To: commits@spark.apache.org
Subject: spark git commit: [SPARK-19127][DOCS] Update Rank Function Documentation
Date: Mon, 9 Jan 2017 01:54:02 +0000 (UTC)

Repository: spark
Updated Branches:
  refs/heads/branch-2.1 ecc16220d -> 8690d4bd1


[SPARK-19127][DOCS] Update Rank Function Documentation

## What changes were proposed in this pull request?

- [X] Fix inconsistencies in function reference for dense rank and dense
- [X] Make all languages equivalent in their reference to `dense_rank` and `rank`.

## How was this patch tested?

N/A for docs.

Please review http://spark.apache.org/contributing.html before opening a pull request.

Author: anabranch

Closes #16505 from anabranch/SPARK-19127.

(cherry picked from commit 1f6ded6455d07ec8828fc9662ddffe55cbba4238)
Signed-off-by: Reynold Xin


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/8690d4bd
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/8690d4bd
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/8690d4bd

Branch: refs/heads/branch-2.1
Commit: 8690d4bd150579e546aec7866b16a77bad1017f5
Parents: ecc1622
Author: anabranch
Authored: Sun Jan 8 17:53:53 2017 -0800
Committer: Reynold Xin
Committed: Sun Jan 8 17:53:59 2017 -0800

----------------------------------------------------------------------
 R/pkg/R/functions.R                                 | 10 ++++++----
 python/pyspark/sql/functions.py                     | 16 ++++++++++------
 .../main/scala/org/apache/spark/sql/functions.scala | 16 ++++++++++------
 3 files changed, 26 insertions(+), 16 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/8690d4bd/R/pkg/R/functions.R
----------------------------------------------------------------------
diff --git a/R/pkg/R/functions.R b/R/pkg/R/functions.R
index bf5c963..6ffa0f5 100644
--- a/R/pkg/R/functions.R
+++ b/R/pkg/R/functions.R
@@ -3150,7 +3150,8 @@ setMethod("cume_dist",
 #' The difference between rank and dense_rank is that dense_rank leaves no gaps in ranking
 #' sequence when there are ties. That is, if you were ranking a competition using dense_rank
 #' and had three people tie for second place, you would say that all three were in second
-#' place and that the next person came in third.
+#' place and that the next person came in third. Rank would give me sequential numbers, making
+#' the person that came in third place (after the ties) would register as coming in fifth.
 #'
 #' This is equivalent to the \code{DENSE_RANK} function in SQL.
 #'
@@ -3321,10 +3322,11 @@ setMethod("percent_rank",
 #'
 #' Window function: returns the rank of rows within a window partition.
 #'
-#' The difference between rank and denseRank is that denseRank leaves no gaps in ranking
-#' sequence when there are ties. That is, if you were ranking a competition using denseRank
+#' The difference between rank and dense_rank is that dense_rank leaves no gaps in ranking
+#' sequence when there are ties. That is, if you were ranking a competition using dense_rank
 #' and had three people tie for second place, you would say that all three were in second
-#' place and that the next person came in third.
+#' place and that the next person came in third. Rank would give me sequential numbers, making
+#' the person that came in third place (after the ties) would register as coming in fifth.
 #'
 #' This is equivalent to the RANK function in SQL.
 #'

http://git-wip-us.apache.org/repos/asf/spark/blob/8690d4bd/python/pyspark/sql/functions.py
----------------------------------------------------------------------
diff --git a/python/pyspark/sql/functions.py b/python/pyspark/sql/functions.py
index d8abafc..7fe901a 100644
--- a/python/pyspark/sql/functions.py
+++ b/python/pyspark/sql/functions.py
@@ -157,17 +157,21 @@ _window_functions = {
     'dense_rank':
         """returns the rank of rows within a window partition, without any gaps.
 
-        The difference between rank and denseRank is that denseRank leaves no gaps in ranking
-        sequence when there are ties. That is, if you were ranking a competition using denseRank
+        The difference between rank and dense_rank is that dense_rank leaves no gaps in ranking
+        sequence when there are ties. That is, if you were ranking a competition using dense_rank
         and had three people tie for second place, you would say that all three were in second
-        place and that the next person came in third.""",
+        place and that the next person came in third. Rank would give me sequential numbers, making
+        the person that came in third place (after the ties) would register as coming in fifth.
+
+        This is equivalent to the DENSE_RANK function in SQL.""",
     'rank':
         """returns the rank of rows within a window partition.
 
-        The difference between rank and denseRank is that denseRank leaves no gaps in ranking
-        sequence when there are ties. That is, if you were ranking a competition using denseRank
+        The difference between rank and dense_rank is that dense_rank leaves no gaps in ranking
+        sequence when there are ties. That is, if you were ranking a competition using dense_rank
         and had three people tie for second place, you would say that all three were in second
-        place and that the next person came in third.
+        place and that the next person came in third. Rank would give me sequential numbers, making
+        the person that came in third place (after the ties) would register as coming in fifth.
 
         This is equivalent to the RANK function in SQL.""",
     'cume_dist':

http://git-wip-us.apache.org/repos/asf/spark/blob/8690d4bd/sql/core/src/main/scala/org/apache/spark/sql/functions.scala
----------------------------------------------------------------------
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/functions.scala b/sql/core/src/main/scala/org/apache/spark/sql/functions.scala
index 650439a..9a080fd 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/functions.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/functions.scala
@@ -785,10 +785,13 @@ object functions {
   /**
    * Window function: returns the rank of rows within a window partition, without any gaps.
    *
-   * The difference between rank and denseRank is that denseRank leaves no gaps in ranking
-   * sequence when there are ties. That is, if you were ranking a competition using denseRank
+   * The difference between rank and dense_rank is that denseRank leaves no gaps in ranking
+   * sequence when there are ties. That is, if you were ranking a competition using dense_rank
    * and had three people tie for second place, you would say that all three were in second
-   * place and that the next person came in third.
+   * place and that the next person came in third. Rank would give me sequential numbers, making
+   * the person that came in third place (after the ties) would register as coming in fifth.
+   *
+   * This is equivalent to the DENSE_RANK function in SQL.
    *
    * @group window_funcs
    * @since 1.6.0
@@ -929,10 +932,11 @@ object functions {
   /**
    * Window function: returns the rank of rows within a window partition.
    *
-   * The difference between rank and denseRank is that denseRank leaves no gaps in ranking
-   * sequence when there are ties. That is, if you were ranking a competition using denseRank
+   * The difference between rank and dense_rank is that dense_rank leaves no gaps in ranking
+   * sequence when there are ties. That is, if you were ranking a competition using dense_rank
    * and had three people tie for second place, you would say that all three were in second
-   * place and that the next person came in third.
+   * place and that the next person came in third. Rank would give me sequential numbers, making
+   * the person that came in third place (after the ties) would register as coming in fifth.
    *
    * This is equivalent to the RANK function in SQL.
    *
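
The tie-handling that the updated docstrings describe (three people tied for second, next person either third or fifth) can be illustrated with a small sketch. This is plain Python, not Spark code and not how Spark implements its window functions; the helpers `rank` and `dense_rank` below are hypothetical stand-ins that only mirror the SQL semantics, and the scores are made-up data assumed to be sorted best-first already.

```python
from typing import List, Sequence


def rank(scores: Sequence[int]) -> List[int]:
    """Hypothetical helper: SQL RANK semantics over pre-sorted scores.

    Tied rows share a rank, and the ranks they "use up" are skipped afterwards.
    """
    ranks: List[int] = []
    for position, score in enumerate(scores, start=1):
        if position > 1 and score == scores[position - 2]:
            ranks.append(ranks[-1])  # tie: reuse the previous row's rank
        else:
            ranks.append(position)   # no tie: rank is the 1-based row position
    return ranks


def dense_rank(scores: Sequence[int]) -> List[int]:
    """Hypothetical helper: SQL DENSE_RANK semantics over pre-sorted scores.

    Tied rows share a rank, and the next distinct score gets the next integer.
    """
    ranks: List[int] = []
    for index, score in enumerate(scores):
        if index > 0 and score == scores[index - 1]:
            ranks.append(ranks[-1])                      # tie: same rank
        else:
            ranks.append(ranks[-1] + 1 if ranks else 1)  # next rank, no gap
    return ranks


# Made-up competition: one winner, three people tied for second, one more.
scores = [100, 90, 90, 90, 80]
print(rank(scores))        # [1, 2, 2, 2, 5]
print(dense_rank(scores))  # [1, 2, 2, 2, 3]
```

With these inputs the last competitor is fifth under `rank` and third under `dense_rank`, which is the distinction the docstrings in the diff are trying to convey.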