hudi-commits mailing list archives

From "Andrew Wong (Jira)" <j...@apache.org>
Subject [jira] [Updated] (HUDI-628) MultiPartKeysValueExtractor does not work with run_sync_tool.sh
Date Fri, 21 Feb 2020 23:09:00 GMT

     [ https://issues.apache.org/jira/browse/HUDI-628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Andrew Wong updated HUDI-628:
-----------------------------
    Summary: MultiPartKeysValueExtractor does not work with run_sync_tool.sh  (was: MultiPartKeysValueExtractor
does not work with HoodieHiveClient.getPartitionClause)

> MultiPartKeysValueExtractor does not work with run_sync_tool.sh
> ---------------------------------------------------------------
>
>                 Key: HUDI-628
>                 URL: https://issues.apache.org/jira/browse/HUDI-628
>             Project: Apache Hudi (incubating)
>          Issue Type: Bug
>            Reporter: Andrew Wong
>            Priority: Major
>         Attachments: stack_trace.txt
>
>
> The [https://hudi.apache.org/docs/quick-start-guide.html] example data has a column
`partitionpath` which holds values like `americas/brazil/sao_paulo`. Using the docker environment,
you can change the basePath from the quickstart to save to hdfs://user/hive/warehouse/hudi_trips_cow.
Then you can see the folder in the HDFS browser, similar to the stock_ticks_cow folder created
in the docker demo.
> However, if you try to use run_sync_tool.sh to sync the table to Hive, you get the error:
"java.lang.IllegalArgumentException: Partition key parts [partitionpath] does not match with
partition values [americas, brazil, sao_paulo]. Check partition strategy. "
> {quote}{{/var/hoodie/ws/hudi-hive/run_sync_tool.sh --jdbc-url jdbc:hive2://hiveserver:10000
--user hive --pass hive --partitioned-by partitionpath --partition-value-extractor org.apache.hudi.hive.MultiPartKeysValueExtractor
--base-path /user/hive/warehouse/hudi_trips_cow --database default
--table hudi_trips_cow}}
> {quote}
> This error is thrown in `HoodieHiveClient.getPartitionClause`, which calls `extractPartitionValuesInPath`
to get a list of partitionValues. The problem is that it compares the number of partitionValues
to the number of partition fields. In this example there is only one partition field, "partitionpath",
whose value is split into three partitionValues, so the check fails and the exception is thrown.
> See [https://github.com/apache/incubator-hudi/blob/master/hudi-hive/src/main/java/org/apache/hudi/hive/HoodieHiveClient.java#L182]
>  
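The mismatch described above can be illustrated with a minimal, self-contained sketch. This is not Hudi's actual code: the class `PartitionCheckSketch`, the method `checkPartition`, and the three field names `continent`, `country`, `city` are hypothetical, and the extractor is assumed to simply split the partition path on "/". Only the size comparison and the error message follow the report.

```java
import java.util.Arrays;
import java.util.List;

public class PartitionCheckSketch {

    // Assumed behaviour of a multi-part extractor: one value per path segment.
    static List<String> extractPartitionValuesInPath(String partitionPath) {
        return Arrays.asList(partitionPath.split("/"));
    }

    // Hypothetical stand-in for the size check described in the report:
    // the number of partition fields must equal the number of extracted values.
    static void checkPartition(List<String> partitionFields, String partitionPath) {
        List<String> partitionValues = extractPartitionValuesInPath(partitionPath);
        if (partitionFields.size() != partitionValues.size()) {
            throw new IllegalArgumentException("Partition key parts " + partitionFields
                + " does not match with partition values " + partitionValues
                + ". Check partition strategy. ");
        }
    }

    public static void main(String[] args) {
        String path = "americas/brazil/sao_paulo";

        // One field ("partitionpath") against three values: the check fails.
        try {
            checkPartition(Arrays.asList("partitionpath"), path);
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }

        // Three hypothetical fields against three values: the check passes.
        checkPartition(Arrays.asList("continent", "country", "city"), path);
        System.out.println("Accepted: three fields, three values");
    }
}
```

Under these assumptions, a multi-part extractor would only pass the check when --partitioned-by lists as many fields as the path has segments.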



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
