From: Sandor Murakozi
Date: Mon, 9 Apr 2018 14:06:17 +0200
Subject: Re: Clarify window behavior in Spark SQL
To: Reynold Xin
Cc: Li Jin, Herman van Hövell tot Westerflier, Xingbo Jiang, dev@spark.apache.org
Hi Li,
You might find my pending PR useful:
https://github.com/apache/spark/pull/20045/files

It contains a large set of test cases covering the windowing functionality, showing and checking the behavior of a number of special cases.

On Wed, Apr 4, 2018 at 4:26 AM, Reynold Xin <rxin@databricks.com> wrote:
Thanks Li!

On Tue, Apr 3, 2018 at 7:23 PM Li Jin <ice.xelloss@gmail.com> wrote:
Thanks all for the explanation. I am happy to update the API doc.

https://issues.apache.org/jira/browse/SPARK-23861
On Tue, Apr 3, 2018 at 8:54 PM, Reynold Xin <rxin@databricks.com> wrote:
Ah ok. Thanks for commenting. Every day I learn something new about SQL.

For others to follow, SQL Server has a good explanation of the behavior: https://docs.microsoft.com/en-us/sql/t-sql/queries/select-over-clause-transact-sql


Can somebody (Li?) update the API documentation to specify the gotchas, in case users are not familiar with SQL window function semantics?



General Remarks

More than one window function can be used in a single query with a single FROM clause. The OVER clause for each function can differ in partitioning and ordering.

If PARTITION BY is not specified, the function treats all rows of the query result set as a single group.

Important!


If ROWS/RANGE is specified and <window frame preceding> is used for <window frame extent> (short syntax) then this specification is used for the window frame boundary starting point and CURRENT ROW is used for the boundary ending point. For example “ROWS 5 PRECEDING” is equal to “ROWS BETWEEN 5 PRECEDING AND CURRENT ROW”.

Note

If ORDER BY is not specified, the entire partition is used for a window frame. This applies only to functions that do not require an ORDER BY clause. If ROWS/RANGE is not specified but ORDER BY is specified, RANGE UNBOUNDED PRECEDING AND CURRENT ROW is used as the default window frame. This applies only to functions that can accept an optional ROWS/RANGE specification. For example, ranking functions cannot accept ROWS/RANGE, therefore this window frame is not applied even though ORDER BY is present and ROWS/RANGE is not.
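(For readers following along in PySpark, here is a minimal sketch of how that short syntax maps onto the DataFrame API, which always spells out both frame boundaries via rowsBetween; the data and column names are hypothetical, assuming Spark 2.x.)

```python
from pyspark.sql import SparkSession, Window
from pyspark.sql.functions import mean

spark = SparkSession.builder.getOrCreate()

# Hypothetical data, just to have something to window over.
df = spark.createDataFrame([(0, float(i)) for i in range(1, 8)], ['id', 'v'])

# "ROWS 5 PRECEDING" corresponds to rowsBetween(-5, Window.currentRow),
# i.e. "ROWS BETWEEN 5 PRECEDING AND CURRENT ROW".
w = Window.partitionBy('id').orderBy('v').rowsBetween(-5, Window.currentRow)

df.withColumn('trailing_mean', mean(df.v).over(w)).show()
```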






On Tue, Apr 3, 2018 at 5:50 PM, Xingbo Jiang <jiangxb1987@gmail.com> wrote:
This is actually by design: without an `ORDER BY` clause, all rows are considered peers of the current row, which means that the frame is effectively the entire partition. This behavior follows the window syntax of PGSQL.
You can refer to the comment by yhuai: https://github.com/apache/spark/pull/5604#discussion_r157931911
:)

2018-04-04 6:27 GMT+08:00 Reynold Xin <rxin@databricks.com>:
Do other (non-Hive) SQL systems do the same thing?

On Tue, Apr 3, 2018 at 3:16 PM, Herman van Hövell tot Westerflier <herman@databricks.com> wrote:

This is something we inherited from Hive: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+WindowingAndAnalytics
When ORDER BY is specified with missing WINDOW clause, the WINDOW specification defaults to RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.
When both ORDER BY and WINDOW clauses are missing, the WINDOW specification defaults to ROW BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.

It sort of makes sense if you think about it. If there is no ordering there is no way to have a bounded frame. If there is ordering we default to the most commonly used deterministic frame.
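(As a rough PySpark sketch of those two defaults made explicit — an illustration, not something from the thread; it assumes the Window.unboundedPreceding / unboundedFollowing / currentRow constants of recent PySpark and the 'id'/'v' columns from the examples below:)

```python
from pyspark.sql import Window

# Ordered window with no explicit frame: the implicit default is
# RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, written out here.
w_ordered = (Window.partitionBy('id').orderBy('v')
             .rangeBetween(Window.unboundedPreceding, Window.currentRow))

# Unordered window with no explicit frame: the implicit default is the
# whole partition, ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.
w_unordered = (Window.partitionBy('id')
               .rowsBetween(Window.unboundedPreceding, Window.unboundedFollowing))
```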


On Tue, Apr 3, 2018 at 11:09 PM, Reynold Xin <rxin@databricks.com> wrote:
Seems like a bug.



On Tue, Apr 3, 2018 at 1:26 PM, Li Jin <ice.xelloss@gmail.com> wrote:
Hi Devs,

I am seeing some behavior with window functions that is a bit unintuitive and would like to get some clarification.

When using an aggregation function with a window, the frame boundary seems to change depending on the ordering of the window.

Example:
(1)

df = spark.createDataFrame([[0, 1], [0, 2], [0, 3]]).toDF('id', 'v')

w1 = Window.partitionBy('id')

df.withColumn('v2', mean(df.v).over(w1)).show()

+---+---+---+
| id|  v| v2|
+---+---+---+
|  0|  1|2.0|
|  0|  2|2.0|
|  0|  3|2.0|
+---+---+---+


(2)
df = spark.createDataFrame([[0, 1], [0, 2], [0, 3]]).toDF('id', 'v')

w2 = Window.partitionBy('id').orderBy('v')

df.withColumn('v2', mean(df.v).over(w2)).show()

+---+---+---+
| id|  v| v2|
+---+---+---+
|  0|  1|1.0|
|  0|  2|1.5|
|  0|  3|2.0|
+---+---+---+


Seems like orderBy('v') in example (2) also changes the frame boundaries from (unboundedPreceding, unboundedFollowing) to (unboundedPreceding, currentRow).
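(Continuing from the example above, a small sketch — assuming the Window.unboundedPreceding/unboundedFollowing constants of recent PySpark — of how the whole-partition frame can be kept even with an ordered window, which makes the two examples agree again:)

```python
# Pin the frame explicitly so that ordering no longer shrinks it to
# (unboundedPreceding, currentRow).
w3 = (Window.partitionBy('id').orderBy('v')
      .rowsBetween(Window.unboundedPreceding, Window.unboundedFollowing))

df.withColumn('v2', mean(df.v).over(w3)).show()
# v2 is 2.0 on every row, as in example (1), despite the orderBy.
```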


I found this behavior a bit unintuitive. I wonder if this behavior is by design and, if so, what the specific rule is for how orderBy() interacts with frame boundaries?



Thanks,

Li








