spark-reviews mailing list archives

From dongjoon-hyun <...@git.apache.org>
Subject [GitHub] spark pull request #22932: [SPARK-25102][SQL] Write Spark version to ORC/Par...
Date Sat, 10 Nov 2018 00:17:06 GMT
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22932#discussion_r232428893
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcFileFormat.scala ---
    @@ -274,6 +278,15 @@ private[orc] class OrcOutputWriter(
     
       override def close(): Unit = {
         if (recordWriterInstantiated) {
    +      // Hive 1.2.1 ORC initializes its private `writer` field at the first write.
    +      try {
    +        val writerField = recordWriter.getClass.getDeclaredField("writer")
    +        writerField.setAccessible(true)
    +        val writer = writerField.get(recordWriter).asInstanceOf[Writer]
    +        writer.addUserMetadata(SPARK_VERSION_METADATA_KEY, UTF_8.encode(SPARK_VERSION_SHORT))
    +      } catch {
    +        case NonFatal(e) => log.warn(e.toString, e)
    +      }
    --- End diff --
    
    BTW, as you expected, we cannot use a single function for this. The `Writer` types are not the same.
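
For illustration only, here is a minimal, self-contained sketch of the reflection pattern in the diff above. `HiveLikeWriter`, `NativeLikeWriter`, `FakeRecordWriter`, and `ReflectionSketch.addVersion` are hypothetical stand-ins, not Spark, Hive, or ORC classes. The two traits share a method name but are unrelated types, so the cast in the helper works for only one of them.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets.UTF_8
import scala.collection.mutable
import scala.util.control.NonFatal

// Hypothetical stand-ins for two unrelated `Writer` interfaces.
// Same method name and signature, but no common supertype.
trait HiveLikeWriter { def addUserMetadata(key: String, value: ByteBuffer): Unit }
trait NativeLikeWriter { def addUserMetadata(key: String, value: ByteBuffer): Unit }

// Hypothetical record writer whose private `writer` field stays null
// until the first write, as the comment in the diff describes.
class FakeRecordWriter {
  val metadata = mutable.Map.empty[String, ByteBuffer]
  private var writer: HiveLikeWriter = null

  def write(): Unit = {
    if (writer == null) {
      writer = new HiveLikeWriter {
        def addUserMetadata(key: String, value: ByteBuffer): Unit =
          metadata(key) = value
      }
    }
  }
}

object ReflectionSketch {
  // Reads the private `writer` field reflectively and adds one metadata entry.
  // Returns false instead of throwing if the field is missing or still null.
  def addVersion(recordWriter: AnyRef, key: String, version: String): Boolean =
    try {
      val writerField = recordWriter.getClass.getDeclaredField("writer")
      writerField.setAccessible(true)
      // This cast is tied to one Writer type; a NativeLikeWriter would need its own helper.
      val writer = writerField.get(recordWriter).asInstanceOf[HiveLikeWriter]
      writer.addUserMetadata(key, UTF_8.encode(version))
      true
    } catch {
      case NonFatal(_) => false
    }
}
```

Calling `ReflectionSketch.addVersion` before the first `write()` returns `false` because the field is still null, which is why the diff guards the call with `recordWriterInstantiated` and a `NonFatal` catch.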

