hudi-commits mailing list archives

From "Andrew Wong (Jira)" <>
Subject [jira] [Created] (HUDI-628) MultiPartKeysValueExtractor does not work with HoodieHiveClient.getPartitionClause
Date Fri, 21 Feb 2020 23:06:00 GMT
Andrew Wong created HUDI-628:

             Summary: MultiPartKeysValueExtractor does not work with HoodieHiveClient.getPartitionClause
                 Key: HUDI-628
             Project: Apache Hudi (incubating)
          Issue Type: Bug
            Reporter: Andrew Wong

The quickstart example data has a column `partitionpath`
which holds values like `americas/brazil/sao_paulo`. Using the docker environment, you can
change the basePath from the quickstart to save to hdfs://user/hive/warehouse/hudi_trips_cow.
Then you can see the folder in the HDFS browser, similar to the stock_ticks_cow folder created
in the docker demo.

However, if you try to sync the table to Hive with the command below, you get the error:
"java.lang.IllegalArgumentException: Partition key parts [partitionpath] does not match with
partition values [americas, brazil, sao_paulo]. Check partition strategy. "
{quote}{{/var/hoodie/ws/hudi-hive/ --jdbc-url jdbc:hive2://hiveserver:10000
--user hive --pass hive --partitioned-by partitionpath --partition-value-extractor org.apache.hudi.hive.MultiPartKeysValueExtractor
--base-path /user/hive/warehouse/hudi_trips_cow --database default
--table hudi_trips_cow}}{quote}
This error is thrown in `HoodieHiveClient.getPartitionClause`, which uses `extractPartitionValuesInPath`
to get a list of partitionValues. The problem is that it compares the number of partitionValues
to the number of partitionFields. In this example, there is only 1 partitionField, "partitionpath",
whose value is split into 3 partitionValues. Since 1 != 3, the check fails and throws the exception.
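
The mismatch described above can be reproduced in isolation. The following is a minimal sketch, not the actual Hudi source: the class name `PartitionClauseSketch`, the slash-splitting extractor, and the format of the returned clause are assumptions made for illustration. Only the size comparison and the error message follow the report. The field names `continent`, `country`, and `city` are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the behaviour described in this report.
// This is NOT the Hudi implementation; names and clause format are assumed.
class PartitionClauseSketch {

    // Assumed behaviour of a multi-part extractor: one value per path segment.
    static List<String> extractPartitionValuesInPath(String partitionPath) {
        return Arrays.asList(partitionPath.split("/"));
    }

    // Mirrors the reported check: the number of partition fields must equal
    // the number of extracted partition values, otherwise an exception is thrown.
    static String getPartitionClause(List<String> partitionFields, String partitionPath) {
        List<String> partitionValues = extractPartitionValuesInPath(partitionPath);
        if (partitionFields.size() != partitionValues.size()) {
            throw new IllegalArgumentException("Partition key parts " + partitionFields
                + " does not match with partition values " + partitionValues
                + ". Check partition strategy. ");
        }
        // Assumed clause format, for illustration only.
        StringBuilder clause = new StringBuilder();
        for (int i = 0; i < partitionFields.size(); i++) {
            if (i > 0) {
                clause.append(",");
            }
            clause.append("`").append(partitionFields.get(i)).append("`='")
                  .append(partitionValues.get(i)).append("'");
        }
        return clause.toString();
    }

    public static void main(String[] args) {
        String path = "americas/brazil/sao_paulo";
        try {
            // One field, three values: the size check fails, as in the report.
            getPartitionClause(Arrays.asList("partitionpath"), path);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
        // Three hypothetical fields, three values: the size check passes.
        System.out.println(getPartitionClause(Arrays.asList("continent", "country", "city"), path));
    }
}
```

Under these assumptions, the multi-part extractor only passes the check when `--partitioned-by` lists as many fields as the path has segments. A single `partitionpath` field holding a three-segment path cannot satisfy it.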

See []


This message was sent by Atlassian Jira
