hudi-users mailing list archives

From Bharat Dighe <bdi...@gmail.com>
Subject Not able to query real time table when rows contains nested elements
Date Wed, 14 Oct 2020 23:02:51 GMT
Hi,

I have a MOR Hudi table whose records contain some nested elements. I am
running it in the Docker demo environment.
I get an exception when I run a select query against the real-time view
that includes nested columns. For example:
1) spark.sql("select name, experience from users_mor_ro") // works fine for RO view
2) spark.sql("select name from users_mor_rt") // works fine for RT view
3) spark.sql("select name, experience from users_mor_rt") // fails for RT view

The 'experience' column above is a nested field.

I am seeing the following exception.

20/10/11 19:53:58 ERROR executor.Executor: Exception in task 0.0 in stage 147.0 (TID 153)
java.lang.UnsupportedOperationException: Cannot inspect org.apache.hadoop.io.Text
    at org.apache.hadoop.hive.ql.io.parquet.serde.ArrayWritableObjectInspector.getStructFieldData(ArrayWritableObjectInspector.java:152)
    at org.apache.spark.sql.hive.HiveInspectors$$anonfun$4$$anonfun$apply$7.apply(HiveInspectors.scala:688)

I have filed https://issues.apache.org/jira/browse/HUDI-1340
and attached my code, Avro files, and Scala code to it.

The same queries work fine with Hive.

Please share if there is a workaround.

Thanks
Bharat
