spark-dev mailing list archives

From Anton Okolnychyi <anton.okolnyc...@gmail.com>
Subject Re: Expand the Spark SQL programming guide?
Date Sun, 18 Dec 2016 14:09:29 GMT
Any comments/suggestions are more than welcome.

Thanks,
Anton

2016-12-18 15:08 GMT+01:00 Anton Okolnychyi <anton.okolnychyi@gmail.com>:

> Here is the pull request: https://github.com/apache/spark/pull/16329
>
>
>
> 2016-12-16 20:54 GMT+01:00 Jim Hughes <jnh5y@ccri.com>:
>
>> I'd be happy to review a PR.  At the minute, I'm still learning Spark
>> SQL, so writing documentation might be a bit of a stretch, but reviewing
>> would be fine.
>>
>> Thanks!
>>
>>
>> On 12/16/2016 08:39 AM, Thakrar, Jayesh wrote:
>>
>> Yes - that sounds good Anton, I can work on documenting the window
>> functions.
>>
>>
>>
>> *From: *Anton Okolnychyi <anton.okolnychyi@gmail.com>
>> *Date: *Thursday, December 15, 2016 at 4:34 PM
>> *To: *Conversant <jthakrar@conversantmedia.com>
>> *Cc: *Michael Armbrust <michael@databricks.com>,
>> Jim Hughes <jnh5y@ccri.com>, "dev@spark.apache.org"
>> <dev@spark.apache.org>
>> *Subject: *Re: Expand the Spark SQL programming guide?
>>
>>
>>
>> I think it will make sense to show a sample implementation of
>> UserDefinedAggregateFunction for DataFrames, and an example of the
>> Aggregator API for typed Datasets.
>>
>>
>>
>> Jim, what if I submit a PR and you join the review process? I also do not
>> mind splitting this if you want, but that seems like overkill for this
>> part.
>>
>>
>>
>> Jayesh, shall I skip the window functions part since you are going to
>> work on that?
>>
>>
>>
>> 2016-12-15 22:48 GMT+01:00 Thakrar, Jayesh <jthakrar@conversantmedia.com>:
>>
>> I too am interested in expanding the documentation for Spark SQL.
>>
>> For my work I needed to get some info/examples/guidance on window
>> functions and have been using
>> https://databricks.com/blog/2015/07/15/introducing-window-functions-in-spark-sql.html .
>>
>> How about divide and conquer?
>>
>>
>>
>>
>>
>> *From: *Michael Armbrust <michael@databricks.com>
>> *Date: *Thursday, December 15, 2016 at 3:21 PM
>> *To: *Jim Hughes <jnh5y@ccri.com>
>> *Cc: *"dev@spark.apache.org" <dev@spark.apache.org>
>> *Subject: *Re: Expand the Spark SQL programming guide?
>>
>>
>>
>> Pull requests would be welcome for any major missing features in the
>> guide:
>> https://github.com/apache/spark/blob/master/docs/sql-programming-guide.md
>>
>>
>>
>> On Thu, Dec 15, 2016 at 11:48 AM, Jim Hughes <jnh5y@ccri.com> wrote:
>>
>> Hi Anton,
>>
>> I'd like to see this as well.  I've been working on implementing
>> geospatial user-defined types and functions.  Having examples of
>> aggregations and window functions would be awesome!
>>
>> I did test out implementing a distributed convex hull as a
>> UserDefinedAggregateFunction, and that seemed to work sensibly.
>>
>> Cheers,
>>
>> Jim
>>
>>
>>
>> On 12/15/2016 03:28 AM, Anton Okolnychyi wrote:
>>
>> Hi,
>>
>>
>>
>> I am wondering whether it makes sense to expand the Spark SQL programming
>> guide with examples of aggregations (including user-defined ones via the
>> Aggregator API) and window functions.  For instance, there might be a
>> separate subsection under "Getting Started" for each feature.
>>
>>
>>
>> SPARK-16046 seems to be related, but there has been no activity on it for
>> more than 4 months.
>>
>>
>>
>> Best regards,
>>
>> Anton
