hive-issues mailing list archives

From "Naveen Gangam (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-16667) PostgreSQL metastore handling of CLOB types for COLUMNS_V2.TYPE_NAME and other field is incorrect
Date Tue, 16 May 2017 14:58:04 GMT

    [ https://issues.apache.org/jira/browse/HIVE-16667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16012517#comment-16012517
] 

Naveen Gangam commented on HIVE-16667:
--------------------------------------

[~rusanu] Could you please post the full stack trace if you still have it? I do not know the
nuances across these different DBs, so I apologize if some of these questions seem obvious.

In the upgrade script, when we cast the existing String values as text, shouldn't the existing
column values also be treated the same as any new inserts? If not, is there an alternate
way of making existing column values use the same format as new values during the upgrade?
{{alter table "COLUMNS_V2" alter column "TYPE_NAME" type text using cast("TYPE_NAME" as text);}}

If the JDO mapping file defines a column as CLOB, wouldn't the column type be Clob irrespective
of the underlying DB? Thanks
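
To make the Clob-versus-String question concrete, here is a minimal Java sketch of a reader that accepts either representation. The class and method names ({{ClobOrString}}, {{asText}}) are hypothetical and are not Hive's {{extractSqlClob}}. It only illustrates that calling code may not be able to assume a {{java.sql.Clob}}. It would not by itself resolve a PostgreSQL large-object handle back into the stored text.

```java
import java.sql.Clob;
import java.sql.SQLException;

/** Hypothetical helper for illustration only; not part of Hive. */
public class ClobOrString {

    /**
     * Returns the text of a column value that a driver may hand back
     * either as a java.sql.Clob or as some other object such as a String.
     */
    public static String asText(Object value) {
        if (value == null) {
            return null;
        }
        if (value instanceof Clob) {
            Clob clob = (Clob) value;
            try {
                long length = clob.length();
                if (length == 0) {
                    return "";
                }
                // Clob positions are 1-based; getSubString takes an int length,
                // so this sketch assumes the value fits in an int.
                return clob.getSubString(1, (int) length);
            } catch (SQLException e) {
                throw new IllegalStateException("Could not read CLOB value", e);
            }
        }
        // Anything else (typically a String) is returned as its text form.
        return value.toString();
    }
}
```

With a helper of this shape, a value that arrives as a plain String passes through unchanged, so on PostgreSQL it would still carry whatever the column physically holds.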

> PostgreSQL metastore handling of CLOB types for COLUMNS_V2.TYPE_NAME and other field
is incorrect
> -------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-16667
>                 URL: https://issues.apache.org/jira/browse/HIVE-16667
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Remus Rusanu
>            Assignee: Naveen Gangam
>
> The CLOB JDO type introduced with HIVE-12274 does not work correctly with PostgreSQL.
The value is written out-of-band and the LOB handle is written, as an INT, into the table.
SELECTs return the INT value, which should have been read via the {{lo_get}} PG built-in, and
then cast into string.
> Furthermore, the behavior is different between fields upgraded from earlier metastore
versions (they retain their string storage) vs. values inserted after the upgrade (inserted
as LOB roots).
> The code in {{MetaStoreDirectSql.getPartitionsFromPartitionIds/extractSqlClob}} expects
the underlying JDO/Datanucleus to map the column to a {{Clob}}, but that does not happen: the
value is a Java String containing the int which is the LOB root saved by PG.
> This manifests at runtime with errors like:
> {code}
> hive> select * from srcpart;
> Failed with exception java.io.IOException:java.lang.IllegalArgumentException: Error:
type expected at the position 0 of '24030:24031' but '24030' is found.
> {code}
> The 24030:24031 should be 'string:string'.
> Repro:
> {code}
> CREATE TABLE srcpart (key STRING COMMENT 'default', value STRING COMMENT 'default') PARTITIONED
BY (ds STRING, hr STRING) STORED AS TEXTFILE;
> LOAD DATA LOCAL INPATH "${hiveconf:test.data.dir}/kv1.txt" OVERWRITE INTO TABLE srcpart
PARTITION (ds="2008-04-09", hr="11");
> select * from srcpart;
> {code}
> I did not see the issue being hit by non-partitioned/textfile tables, but that is just
the luck of the path taken by the code. Inspection of my PG metastore shows all the CLOB fields
suffering from this issue.
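
As a rough way to spot affected values while inspecting a metastore, the symptom above (a type name such as {{24030}} where {{string}} was expected) can be checked with a simple heuristic: a Hive type name starts with a letter, so a value made up only of digits is suspect. The sketch below is an illustration under that assumption. The class and method names are hypothetical, and it does not touch the database.

```java
/** Hypothetical diagnostic helper for illustration only; not part of Hive. */
public class TypeNameCheck {

    /**
     * Returns true when the value consists solely of ASCII digits,
     * for example "24030", which suggests a LOB handle rather than a type name.
     */
    public static boolean looksLikeLobHandle(String typeName) {
        if (typeName == null || typeName.isEmpty()) {
            return false;
        }
        for (int i = 0; i < typeName.length(); i++) {
            char c = typeName.charAt(i);
            if (c < '0' || c > '9') {
                return false;
            }
        }
        return true;
    }
}
```

A value flagged this way is only a candidate; confirming it would still require checking the row against PostgreSQL's large-object storage.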



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
