spark-reviews mailing list archives

From cloud-fan <...@git.apache.org>
Subject [GitHub] spark pull request #19389: [SPARK-22165][SQL] Resolve type conflicts between...
Date Mon, 13 Nov 2017 22:39:47 GMT
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19389#discussion_r150687775
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala ---
    @@ -468,14 +460,16 @@ object PartitioningUtils {
       }
     
       /**
    -   * Given a collection of [[Literal]]s, resolves possible type conflicts by up-casting "lower"
    -   * types.
    +   * Given a collection of [[Literal]]s, resolves possible type conflicts by
    +   * [[TypeCoercion.findWiderCommonType]]. See [[TypeCoercion.findWiderTypeForTwo]].
        */
       private def resolveTypeConflicts(literals: Seq[Literal], timeZone: TimeZone): Seq[Literal] = {
    -    val desiredType = {
    -      val topType = literals.map(_.dataType).maxBy(upCastingOrder.indexOf(_))
    --- End diff --
    
    Partitioned columns are different from normal type coercion cases: the values are literally all strings, and we are just trying to find the most reasonable type for them.
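
    For illustration, a minimal sketch of what the reworked helper could look like, assuming
    TypeCoercion.findWiderCommonType is reachable from PartitioningUtils and that a None result
    falls back to StringType (the exact fallback is part of the discussion, not fixed by this diff):

        import java.util.TimeZone

        import org.apache.spark.sql.catalyst.analysis.TypeCoercion
        import org.apache.spark.sql.catalyst.expressions.{Cast, Literal}
        import org.apache.spark.sql.types.{DataType, StringType}

        object PartitionTypeMergeSketch {
          // Sketch only: widen all inferred partition-value types to one common type,
          // falling back to string when no common type exists.
          def resolveTypeConflicts(literals: Seq[Literal], timeZone: TimeZone): Seq[Literal] = {
            val desiredType: DataType = TypeCoercion
              .findWiderCommonType(literals.map(_.dataType))
              .getOrElse(StringType) // assumption: string is the safe fallback

            literals.map { literal =>
              // Cast each literal to the common type, passing the session time zone so
              // date/timestamp values are reinterpreted consistently.
              Literal.create(Cast(literal, desiredType, Some(timeZone.getID)).eval(), desiredType)
            }
          }
        }

    With this sketch, for example, IntegerType and DoubleType partition values would merge to
    DoubleType, while a DecimalType/DateType conflict would fall back to StringType.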
    
    The previous behavior has been there since the very beginning, and I don't think it went through a decent discussion. This is the first time we have seriously designed the type-merging logic for partition discovery. I don't think it needs to be blocked by the type coercion stabilization work, as the two can diverge.
    
    @HyukjinKwon can you send the proposal to the dev list? I think we need more feedback, e.g. people may want stricter rules and more cases that fall back to string.
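
    As a rough illustration of the "stricter rules" direction (the object name and the whitelist
    below are hypothetical, not part of this PR): one could merge types pairwise and fall back to
    StringType for anything outside a small set of known-safe widenings:

        import org.apache.spark.sql.types._

        // Hypothetical stricter merge: only widen within a small numeric whitelist;
        // any other conflict (e.g. decimal vs. date) falls back to string.
        object StrictPartitionTypeMerge {
          def merge(t1: DataType, t2: DataType): DataType = (t1, t2) match {
            case (a, b) if a == b => a
            case (NullType, other) => other
            case (other, NullType) => other
            case (IntegerType, LongType) | (LongType, IntegerType) => LongType
            case (IntegerType, DoubleType) | (DoubleType, IntegerType) => DoubleType
            case _ => StringType
          }

          def mergeAll(types: Seq[DataType]): DataType =
            types.reduceLeftOption(merge).getOrElse(NullType)
        }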


