spark-reviews mailing list archives

From viirya <...@git.apache.org>
Subject [GitHub] spark pull request #19570: [SPARK-22335][SQL] Clarify union behavior on Data...
Date Sat, 28 Oct 2017 00:31:24 GMT
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19570#discussion_r147540274
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
    @@ -1753,6 +1753,27 @@ class Dataset[T] private[sql](
        *
        * Also as standard in SQL, this function resolves columns by position (not by name).
        *
    +   * Notice that the column positions in the schema aren't necessarily matched with the
    +   * fields in the typed objects in a Dataset. This function resolves columns by their positions
    +   * in the schema, not the fields in the typed objects, as this Scala example shows:
    +   *
    +   * {{{
    +   *   case class Test(a: String, b: String)
    +   *   val ds1 = Seq(("a", "b")).toDF("a", "b").as[Test] // ds1's schema: [a: String, b: String]
    +   *   val ds2 = Seq(("b", "a")).toDF("b", "a").as[Test] // ds2's schema: [b: String, a: String]
    +   *   ds1.union(ds2).show
    +   *
    +   *   // output:
    +   *   // +---+---+
    +   *   // |  a|  b|
    +   *   // +---+---+
    +   *   // |  a|  b|
    +   *   // |  b|  a|
    +   *   // +---+---+
    --- End diff --
    
    Sorry, I don't understand what you mean by "same example as `union`". This is the only example
    for `union`, unless I'm missing something.
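
    For readers following the thread, here is a minimal sketch of the behavior the doc comment
    describes and of one way to line the columns up before the union. It assumes a local
    `SparkSession`; the object name `UnionByPositionSketch` is hypothetical, and `unionByName`
    is only available from Spark 2.3.0 onward, so that call is an assumption about the version
    in use.

    ```scala
    import org.apache.spark.sql.SparkSession

    // Hypothetical wrapper object, for illustration only.
    object UnionByPositionSketch {
      // Same case class as in the doc comment under review.
      case class Test(a: String, b: String)

      def main(args: Array[String]): Unit = {
        // Assumption: a local session is good enough to try this out.
        val spark = SparkSession.builder()
          .master("local[1]")
          .appName("union-by-position-sketch")
          .getOrCreate()
        import spark.implicits._

        val ds1 = Seq(("a", "b")).toDF("a", "b").as[Test] // schema: [a, b]
        val ds2 = Seq(("b", "a")).toDF("b", "a").as[Test] // schema: [b, a]

        // union resolves columns by schema position, so ds2's column `b`
        // ends up under ds1's column `a`.
        ds1.union(ds2).show()

        // Reordering ds2's columns to ds1's schema order first makes the
        // positional union agree with the field names.
        ds1.union(ds2.select("a", "b").as[Test]).show()

        // Assumes Spark 2.3.0 or later: resolve columns by name instead.
        ds1.unionByName(ds2).show()

        spark.stop()
      }
    }
    ```

    The explicit `select` keeps the positional semantics of `union` and only changes the input
    order, so it should also work on versions that predate `unionByName`.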

