hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Bobeff (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HIVE-13534) xception when trying to access TIMESTAMP columns into parquet file using hive external table
Date Mon, 18 Apr 2016 10:46:25 GMT
Bobeff created HIVE-13534:
-----------------------------

             Summary: xception when trying to access TIMESTAMP columns into parquet file using
hive external table 
                 Key: HIVE-13534
                 URL: https://issues.apache.org/jira/browse/HIVE-13534
             Project: Hive
          Issue Type: Bug
          Components: File Formats, Hive, Import/Export, JDBC
            Reporter: Bobeff
            Assignee: Sushanth Sowmyan
            Priority: Critical


Imported data was stored from a netezza datasource using a sqoop import command like this


SQL DDL creation script of imported table looks like this 

CREATE TABLE "ADMIN"."MIS_AUX_ITR" ( 
"DDEBVAL" DATE, 
"DFINVAL" DATE, 
"NAUX" VARCHAR(6), 
"CDMNITR" VARCHAR(3), 
"CDERIMG" VARCHAR(1), 
"DDERIMG" DATE 
); 

Import sqoop job is the following 

sqoop job 
--create import-name 
-- import 
--connect jdbc:netezza://server:port/database 
--username user 
--password pwd 
--table MIS_AUX_ITR 
--as-parquetfile 
--target-dir hdfs:///prod/ZA/dee/MIS_AUX_ITR 
-m 1 

After import parquet file schema is the following 

> yarn jar /tmp/parquet-tools-1.6.0.jar schema /prod/ZA/dee/MIS_AUX_ITR/2cf3e971-4c2c-408f-bd86-5d3cf3bd4fa5.parquet


message MIS_AUX_ITR { 
optional int64 DDEBVAL; 
optional int64 DFINVAL; 
optional binary NAUX (UTF8); 
optional binary CDMNITR (UTF8); 
optional binary CDERIMG (UTF8); 
optional int64 DDERIMG; 
} 

In order to access data stored into the parquet file we created the external table below 

CREATE EXTERNAL TABLE za_dee.MIS_AUX_ITR 
( 
`DDEBVAL`	DATE, 
`DFINVAL`	DATE, 
`NAUX`	VARCHAR(6), 
`CDMNITR`	VARCHAR(3), 
`CDERIMG`	VARCHAR(1), 
`DDERIMG`	DATE 
) 
COMMENT 'Table DEE MIS_AUX_ITR' 
STORED AS PARQUET 
LOCATION 
'/prod/ZA/dee/MIS_AUX_ITR'; 


But when we try to list data from external table above we get the following exception 

hive> CREATE EXTERNAL TABLE za_dee.MIS_AUX_ITR_V_PPROD 
> ( 
> `DDEBVAL`DATE, 
> `DFINVAL`DATE, 
> `NAUX`VARCHAR(6), 
> `CDMNITR`VARCHAR(3), 
> `CDERIMG`VARCHAR(1), 
> `DDERIMG`DATE 
> ) 
> COMMENT 'Table DEE MIS_AUX_ITR_V_PROD' 
> STORED AS PARQUET 
> LOCATION 
> '/prod/ZA/dee/MIS_AUX_ITR_V_PPROD'; 
OK 
Time taken: 0.196 seconds 
hive> select * from za_dee.MIS_AUX_ITR_V_PPROD limit 100; 
OK 
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". 
SLF4J: Defaulting to no-operation (NOP) logger implementation 
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details. 
Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException:
java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.hive.serde2.io.DateWritable

Time taken: 0.529 seconds 
hive> 


We also tried with the following external table 

CREATE EXTERNAL TABLE za_dee.MIS_AUX_ITR_V_PPROD_BI 
( 
`DDEBVAL`	BIGINT, 
`DFINVAL`	BIGINT, 
`NAUX`	VARCHAR(6), 
`CDMNITR`	VARCHAR(3), 
`CDERIMG`	VARCHAR(1), 
`DDERIMG`	BIGINT 
) 
COMMENT 'Table DEE MIS_AUX_ITR_V_PROD_BI' 
STORED AS PARQUET 
LOCATION '/prod/ZA/dee/MIS_AUX_ITR_V_PPROD'; 

Then the “Date” columns are shown as “timestamp” values as below 
hive> select DDEBVAL from za_dee.MIS_AUX_ITR_V_PPROD_BI limit 5; 
OK 
1080770400000 
1080770400000 
1080770400000 
1080770400000 
1080770400000 
Time taken: 0.081 seconds, Fetched: 5 row(s) 
hive> 

However “Date” values can be listed by casting as Timestamp 
hive> select cast(DDEBVAL as Timestamp) from za_dee.MIS_AUX_ITR_V_PPROD_BI limit 5; 
OK 
2004-04-01 00:00:00 
2004-04-01 00:00:00 
2004-04-01 00:00:00 
2004-04-01 00:00:00 
2004-04-01 00:00:00 
Time taken: 0.087 seconds, Fetched: 5 row(s) 
hive> 

We also have tested with an external table using TIMESTAMP type as shown below 
CREATE EXTERNAL TABLE za_dee.MIS_AUX_ITR 
( 
`DDEBVAL`	TIMESTAMP, 
`DFINVAL`	TIMESTAMP, 
`NAUX`	VARCHAR(6), 
`CDMNITR`	VARCHAR(3), 
`CDERIMG`	VARCHAR(1), 
`DDERIMG`	TIMESTAMP 
) 
COMMENT 'Table DEE MIS_AUX_ITR' 
STORED AS PARQUET 
LOCATION 
'/prod/ZA/dee/MIS_AUX_ITR'; 

But we got the same behavior: an exception when trying to access data from an Oracle DB.

I tried this 

CREATE EXTERNAL TABLE za_dee.MIS_AUX_ITR_V_PPROD_TS 
( 
`DDEBVAL`	TIMESTAMP, 
`DFINVAL`	TIMESTAMP, 
`NAUX`	VARCHAR(6), 
`CDMNITR`	VARCHAR(3), 
`CDERIMG`	VARCHAR(1), 
`DDERIMG`	TIMESTAMP 
) 
COMMENT 'Table DEE MIS_AUX_ITR_V_PROD_TS' 
STORED AS PARQUET 
LOCATION 
'/prod/ZA/dee/MIS_AUX_ITR_V_PPROD'; 

and then i created and launched the sqoop job below 

sqoop job --create import-za_dee-MIS_AUX_ITR_V-full-default-import-PPROD -- import 
--connect jdbc:netezza:/<server>:<port>/db 
--username <user> 
--password <password> 
--table MIS_AUX_ITR_V 
--as-parquetfile 
--hive-import 
--hive-overwrite 
--hive-database za_dee 
--hive-table MIS_AUX_ITR_V_PPROD_TS 
-m 1 

sqoop job --exec import-za_dee-MIS_AUX_ITR_V-full-default-import-PPROD 

the raising error is the following 

16/04/11 17:15:09 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-sqoop/compile/3533e18a81a65fb8eb88ec9cef2f4688/codegen_MIS_AUX_ITR_V.jar

16/04/11 17:15:09 WARN manager.NetezzaManager: It looks like you are importing from Netezza.

16/04/11 17:15:09 WARN manager.NetezzaManager: This transfer can be faster! Use the --direct

16/04/11 17:15:09 WARN manager.NetezzaManager: option to exercise a Netezza-specific fast
path. 
16/04/11 17:15:17 INFO mapreduce.ImportJobBase: Beginning import of MIS_AUX_ITR_V 
16/04/11 17:15:17 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM "MIS_AUX_ITR_V"
AS t WHERE 1=0 
16/04/11 17:15:18 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM "MIS_AUX_ITR_V"
AS t WHERE 1=0 
16/04/11 17:15:18 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM "MIS_AUX_ITR_V"
AS t WHERE 1=0 
16/04/11 17:15:19 INFO hive.metastore: Trying to connect to metastore with URI thrift://slhdm007.maif.local:9083

16/04/11 17:15:19 INFO hive.metastore: Connected to metastore. 
16/04/11 17:15:19 ERROR tool.ImportTool: Imported Failed: Cannot convert unsupported type:
timestamp 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message