spark-reviews mailing list archives

From gatorsmile <>
Subject [GitHub] spark pull request #16700: [SPARK-19359][SQL]clear useless path after rename...
Date Fri, 27 Jan 2017 02:03:56 GMT
Github user gatorsmile commented on a diff in the pull request:
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala
    @@ -899,6 +918,22 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf:
               spec, partitionColumnNames, tablePath)
             try {
               tablePath.getFileSystem(hadoopConf).rename(wrongPath, rightPath)
    +          // If the newSpec contains more than one depth partition, FileSystem.rename just deletes
    +          // the leaf(i.e. wrongPath), we should check if wrongPath's parents need to be deleted.
    +          // for example:
    +          // newSpec is 'A=1/B=2', after renamePartitions by Hive, the location path in FileSystem
    +          // is changed to 'a=1/b=2', which is wrongPath, then we renamed to 'A=1/B=2',
    +          // 'a=1/b=2' in FileSystem is deleted, while 'a=1' is already exists,
    +          // which should also be deleted
    --- End diff --
    How about this?
    > For example, given a newSpec 'A=1/B=2': after calling Hive's client.renamePartitions,
    > the location path in FileSystem is changed to 'a=1/b=2', which is wrongPath. Then,
    > although we rename it to 'A=1/B=2', 'a=1/b=2' in FileSystem is deleted but 'a=1'
    > still exists. We also need to delete the useless directory.
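The cleanup being discussed can be sketched as follows. This is a minimal, hypothetical illustration (not Spark's actual patch): it uses `java.nio.file` so it is self-contained and runnable, whereas `HiveExternalCatalog` works against Hadoop's `FileSystem` API. The object name `StalePartitionDirCleanup` and the helper `deleteEmptyParents` are made up for this sketch.

```scala
import java.nio.file.{Files, Path}

// Hypothetical sketch: after renaming only the leaf ('a=1/b=2' -> 'A=1/B=2'),
// the old parent directory 'a=1' may be left behind empty and should be removed.
// Spark's real code would use Hadoop's FileSystem, not java.nio.file.
object StalePartitionDirCleanup {
  // Walk upward from wrongPath's parent toward the table root,
  // deleting each directory that is now empty.
  def deleteEmptyParents(wrongPath: Path, tablePath: Path): Unit = {
    var current = wrongPath.getParent
    var done = false
    while (!done && current != null && current != tablePath && Files.exists(current)) {
      val stream = Files.list(current)
      val isEmpty = try !stream.iterator().hasNext finally stream.close()
      if (isEmpty) {
        Files.delete(current)
        current = current.getParent
      } else {
        done = true // a non-empty ancestor stops the cleanup
      }
    }
  }

  def main(args: Array[String]): Unit = {
    val table = Files.createTempDirectory("tbl")
    // Simulate the lower-cased layout produced by Hive's renamePartitions.
    val wrong = table.resolve("a=1").resolve("b=2")
    Files.createDirectories(wrong)
    val right = table.resolve("A=1").resolve("B=2")
    Files.createDirectories(right.getParent)
    Files.move(wrong, right)         // the rename moves only the leaf 'b=2'
    deleteEmptyParents(wrong, table) // removes the now-empty 'a=1'
    println(Files.exists(table.resolve("a=1")))
  }
}
```

Walking up only while each ancestor is empty mirrors the constraint in the comment: only the useless directories left over from the wrong casing are removed, never a parent that still holds live partition data. (Note the demonstration assumes a case-sensitive filesystem, where 'a=1' and 'A=1' are distinct.)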
