hive-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Naveen Gangam (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-16667) PostgreSQL metastore handling of CLOB types for COLUMNS_V2.TYPE_NAME and other field is incorrect
Date Wed, 17 May 2017 17:27:04 GMT

    [ https://issues.apache.org/jira/browse/HIVE-16667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16014450#comment-16014450
] 

Naveen Gangam commented on HIVE-16667:
--------------------------------------

I only have these 2 jars for the driver. It has to be one of them {{postgresql-9.0-801.jdbc4.jar}}
and {{postgresql-9.1-901.jdbc4.jar}}. They are exactly the same size, so I am assuming they
are the same.

{code}
hive> CREATE TABLE srcpart (key STRING COMMENT 'default', value STRING COMMENT 'default')
PARTITIONED BY (ds STRING, hr STRING) STORED AS TEXTFILE;
OK
Time taken: 13.775 seconds
hive> LOAD DATA LOCAL INPATH "/Users/ngangam/apache/hive/data/files/kv1.txt" OVERWRITE
INTO TABLE srcpart PARTITION (ds="2008-04-09",hr="11");
Loading data to table default.srcpart partition (ds=2008-04-09, hr=11)
OK
Time taken: 132.365 seconds
hive> select * from srcpart;
OK
238	val_238	2008-04-09	11
86	val_86	2008-04-09	11
311	val_311	2008-04-09	11
27	val_27	2008-04-09	11
165	val_165	2008-04-09	11
409	val_409	2008-04-09	11
....
200	val_200	2008-04-09	11
97	val_97	2008-04-09	11
Time taken: 17.693 seconds, Fetched: 500 row(s)
{code}

My hive-site.xml has the following .. should be the same as yours as far as connection params
and a driver are concerned. 
{code}
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>org.postgresql.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive1</value>
  </property>
{code}


> PostgreSQL metastore handling of CLOB types for COLUMNS_V2.TYPE_NAME and other field
is incorrect
> -------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-16667
>                 URL: https://issues.apache.org/jira/browse/HIVE-16667
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Remus Rusanu
>            Assignee: Naveen Gangam
>
> The CLOB JDO type introduced with HIVE-12274 does not work correctly with PostgreSQL.
The value is written out-of-band and the LOB handle is written,as an INT, into the table.
SELECTs return the INT value, which should had been read via the {{lo_get}} PG built-in, and
then cast into string.
> Furthermore, the behavior is different between fields upgraded from earlier metastore
versions (they retain their string storage) vs. values inserted after the upgrade (inserted
as LOB roots).
> Teh code in {{MetasoreDirectSql.getPartitionsFromPartitionIds/extractSqlClob}} expects
the underlying JDO/Datanucleus to map the column to a {{Clob}} but that does not happen, the
value is a Java String containing the int which is the LOB root saved by PG.
> This manifests at runtime with errors like:
> {code}
> hive> select * from srcpart;
> Failed with exception java.io.IOException:java.lang.IllegalArgumentException: Error:
type expected at the position 0 of '24030:24031' but '24030' is found.
> {code}
> the 24030:24031 should be 'string:string'.
> repro:
> {code}
> CREATE TABLE srcpart (key STRING COMMENT 'default', value STRING COMMENT 'default') PARTITIONED
BY (ds STRING, hr STRING) STORED AS TEXTFILE;
> LOAD DATA LOCAL INPATH "${hiveconf:test.data.dir}/kv1.txt" OVERWRITE INTO TABLE srcpart
PARTITION (ds="2008-04-09", hr="11");
> select * from srcpart;
> {code}
> I did not see the issue being hit by non-partitioned/textfile tables, but that is just
the luck of the path taken by the code. Inspection of my PG metastore shows all the CLOB fields
suffering from this issue.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Mime
View raw message