spark-issues mailing list archives

From "Rishabh Bhardwaj (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-13458) Datasets cannot be sorted
Date Thu, 17 Mar 2016 11:11:33 GMT

    [ https://issues.apache.org/jira/browse/SPARK-13458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199324#comment-15199324 ]

Rishabh Bhardwaj commented on SPARK-13458:
------------------------------------------

[~obeattie] In the master branch, Dataset.scala has these methods. Is there something else
you are looking for?
{code}
scala> val ds = sqlContext.createDataFrame(Seq((1,2),(2,3),(34,2),(4,45),(56,444))).as[(Int,Int)]
ds: org.apache.spark.sql.Dataset[(Int, Int)] = [_1: int, _2: int]

scala> ds.sort("_1")
res5: org.apache.spark.sql.Dataset[(Int, Int)] = [_1: int, _2: int]

scala> res5.show
+---+---+
| _1| _2|
+---+---+
|  1|  2|
|  2|  3|
|  4| 45|
| 34|  2|
| 56|444|
+---+---+
{code}
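
For completeness, here is a rough sketch of the other methods named in the issue ({{orderBy}} and {{sortWithinPartitions}}) called directly on a typed Dataset. It is not taken from the session above. It assumes a Spark 2.x or later build, where {{SparkSession}} and {{toDS()}} are available; neither exists under those names in 1.6.0. The value names ({{byFirst}}, {{bySecondDesc}}, {{perPartition}}) are illustrative only.

```scala
// Sketch only: assumes Spark 2.x+ on the classpath (e.g. pasted into spark-shell).
import org.apache.spark.sql.{Dataset, SparkSession}

// getOrCreate() reuses the shell's existing session if there is one.
val spark = SparkSession.builder()
  .master("local[2]")
  .appName("dataset-sort-sketch")
  .getOrCreate()
import spark.implicits._

val ds: Dataset[(Int, Int)] =
  Seq((1, 2), (2, 3), (34, 2), (4, 45), (56, 444)).toDS()

// sort and orderBy are aliases; both keep the element type (Int, Int).
val byFirst: Dataset[(Int, Int)] = ds.sort("_1")
val bySecondDesc: Dataset[(Int, Int)] = ds.orderBy($"_2".desc, $"_1")

// sortWithinPartitions orders rows inside each partition only;
// it does not produce a global ordering across partitions.
val perPartition: Dataset[(Int, Int)] =
  ds.repartition(2).sortWithinPartitions("_1")

byFirst.show()
bySecondDesc.show()
perPartition.show()
```

In all three cases the result stays a {{Dataset[(Int, Int)]}}, so no conversion to {{DataFrame}} is needed before sorting.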

> Datasets cannot be sorted
> -------------------------
>
>                 Key: SPARK-13458
>                 URL: https://issues.apache.org/jira/browse/SPARK-13458
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 1.6.0
>            Reporter: Oliver Beattie
>
> There doesn't appear to be any way to sort a {{Dataset}} at present, without first converting
> it to a {{DataFrame}}.
> Methods like {{orderBy}}, {{sort}}, and {{sortWithinPartitions}} which are present on
> {{DataFrame}}, or {{sortBy}} which is present on {{RDD}}, are absent.


