Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id A40F6200C7E for ; Tue, 23 May 2017 09:29:09 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id A2ADB160BC3; Tue, 23 May 2017 07:29:09 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id EA3DD160BB6 for ; Tue, 23 May 2017 09:29:08 +0200 (CEST) Received: (qmail 69330 invoked by uid 500); 23 May 2017 07:29:07 -0000 Mailing-List: contact issues-help@hive.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@hive.apache.org Delivered-To: mailing list issues@hive.apache.org Received: (qmail 69321 invoked by uid 99); 23 May 2017 07:29:07 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd2-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 23 May 2017 07:29:07 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd2-us-west.apache.org (ASF Mail Server at spamd2-us-west.apache.org) with ESMTP id 0B78D1AA291 for ; Tue, 23 May 2017 07:29:07 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd2-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: -99.202 X-Spam-Level: X-Spam-Status: No, score=-99.202 tagged_above=-999 required=6.31 tests=[KAM_ASCII_DIVIDERS=0.8, RP_MATCHES_RCVD=-0.001, SPF_PASS=-0.001, USER_IN_WHITELIST=-100] autolearn=disabled Received: from mx1-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd2-us-west.apache.org [10.40.0.9]) (amavisd-new, port 10024) with ESMTP id BTTshR3jR7LW for ; Tue, 23 May 2017 07:29:05 +0000 (UTC) Received: from mailrelay1-us-west.apache.org (mailrelay1-us-west.apache.org [209.188.14.139]) by mx1-lw-eu.apache.org (ASF Mail Server at mx1-lw-eu.apache.org) with ESMTP id 1F4625F523 for ; Tue, 23 May 2017 07:29:05 +0000 (UTC) Received: from jira-lw-us.apache.org (unknown [207.244.88.139]) by mailrelay1-us-west.apache.org (ASF Mail Server at mailrelay1-us-west.apache.org) with ESMTP id 525D6E005B for ; Tue, 23 May 2017 07:29:04 +0000 (UTC) Received: from jira-lw-us.apache.org (localhost [127.0.0.1]) by jira-lw-us.apache.org (ASF Mail Server at jira-lw-us.apache.org) with ESMTP id 1153221B56 for ; Tue, 23 May 2017 07:29:04 +0000 (UTC) Date: Tue, 23 May 2017 07:29:04 +0000 (UTC) From: "Remus Rusanu (JIRA)" To: issues@hive.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Commented] (HIVE-16667) PostgreSQL metastore handling of CLOB types for COLUMNS_V2.TYPE_NAME and other field is incorrect MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 archived-at: Tue, 23 May 2017 07:29:09 -0000 [ https://issues.apache.org/jira/browse/HIVE-16667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16020759#comment-16020759 ] Remus Rusanu commented on HIVE-16667: ------------------------------------- I don't see how your patch will address the problem of mixing pre v3.0 upgraded tables and post v.30 created one. the new ones will store the {{TYPE_NAME}} as a LOB handle, while the upgraded ones will have the inlined string value. Can we explore instead reverting the column type mapping to {{VARCHAR}} in the package.jdo, (or {{LONGVARCHAR}} if need be) instead of {{CLOB}}? Keep the metastore upgrade scripts, change the underlying storage tyopes to respective large types, but have JDO map them to String, not Clob. On my testing with PG, this works correctly on all cases I tested (my repro, your large serde DDL from HIVE-12274), and should handle upgrade correctly. But I did not test this with other metastore engines, Oracle, MySQL etc. > PostgreSQL metastore handling of CLOB types for COLUMNS_V2.TYPE_NAME and other field is incorrect > ------------------------------------------------------------------------------------------------- > > Key: HIVE-16667 > URL: https://issues.apache.org/jira/browse/HIVE-16667 > Project: Hive > Issue Type: Bug > Reporter: Remus Rusanu > Assignee: Naveen Gangam > Attachments: HIVE-16667.patch, HiveCLIOutput.txt, PostgresDBOutput.txt > > > The CLOB JDO type introduced with HIVE-12274 does not work correctly with PostgreSQL. The value is written out-of-band and the LOB handle is written,as an INT, into the table. SELECTs return the INT value, which should had been read via the {{lo_get}} PG built-in, and then cast into string. > Furthermore, the behavior is different between fields upgraded from earlier metastore versions (they retain their string storage) vs. values inserted after the upgrade (inserted as LOB roots). > Teh code in {{MetasoreDirectSql.getPartitionsFromPartitionIds/extractSqlClob}} expects the underlying JDO/Datanucleus to map the column to a {{Clob}} but that does not happen, the value is a Java String containing the int which is the LOB root saved by PG. > This manifests at runtime with errors like: > {code} > hive> select * from srcpart; > Failed with exception java.io.IOException:java.lang.IllegalArgumentException: Error: type expected at the position 0 of '24030:24031' but '24030' is found. > {code} > the 24030:24031 should be 'string:string'. > repro: > {code} > CREATE TABLE srcpart (key STRING COMMENT 'default', value STRING COMMENT 'default') PARTITIONED BY (ds STRING, hr STRING) STORED AS TEXTFILE; > LOAD DATA LOCAL INPATH "${hiveconf:test.data.dir}/kv1.txt" OVERWRITE INTO TABLE srcpart PARTITION (ds="2008-04-09", hr="11"); > select * from srcpart; > {code} > I did not see the issue being hit by non-partitioned/textfile tables, but that is just the luck of the path taken by the code. Inspection of my PG metastore shows all the CLOB fields suffering from this issue. -- This message was sent by Atlassian JIRA (v6.3.15#6346)