Date: Mon, 29 Sep 2014 22:19:35 +0000 (UTC)
From: "Doug Sedlak (JIRA)"
To: hive-dev@hadoop.apache.org
Subject: [jira] [Created] (HIVE-8297) Wrong results with JDBC direct read of TIMESTAMP column in RCFile and ORC format

Doug Sedlak created HIVE-8297:
---------------------------------

             Summary: Wrong results with JDBC direct read of TIMESTAMP column in RCFile and ORC format
                 Key: HIVE-8297
                 URL: https://issues.apache.org/jira/browse/HIVE-8297
             Project: Hive
          Issue Type: Bug
          Components: CLI, JDBC
    Affects Versions: 0.13.0
         Environment: Linux
            Reporter: Doug Sedlak

For a query of the form:

    SELECT * FROM [table]

JDBC reads the table's backing data directly, rather than launching a MapReduce job and creating a result set.
Where the table format is RCFile or ORC, the JDBC direct read delivers incorrect results for TIMESTAMP columns. If you force creation of a result set, correct data is returned.

To reproduce using beeline:

1) Create this file locally and copy it to HDFS:

    $ cat > /tmp/ts.txt
    2014-09-28 00:00:00
    2014-09-29 00:00:00
    2014-09-30 00:00:00
    $ hadoop fs -copyFromLocal /tmp/ts.txt /tmp/ts.txt

2) In beeline, load the HDFS data above into a TEXTFILE table and verify that it reads back correctly:

    $ beeline
    > !connect jdbc:hive2://:/ hive pass org.apache.hive.jdbc.HiveDriver
    > drop table `TIMESTAMP_TEXT`;
    > CREATE TABLE `TIMESTAMP_TEXT` (`ts` TIMESTAMP) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001' LINES TERMINATED BY '\012' STORED AS TEXTFILE;
    > LOAD DATA INPATH '/tmp/ts.txt' OVERWRITE INTO TABLE `TIMESTAMP_TEXT`;
    > select * from `TIMESTAMP_TEXT`;

3) In beeline, create an RCFile table and load it from the TEXTFILE table:

    > drop table `TIMESTAMP_RCFILE`;
    > CREATE TABLE `TIMESTAMP_RCFILE` (`ts` TIMESTAMP) stored as rcfile;
    > INSERT INTO TABLE `TIMESTAMP_RCFILE` SELECT * FROM `TIMESTAMP_TEXT`;

4) Compare the incorrect direct JDBC read with the correct read obtained by forcing result set creation:

    > SELECT * FROM `TIMESTAMP_RCFILE`;
    +------------------------+
    |  timestamp_rcfile.ts   |
    +------------------------+
    | 2014-09-30 00:00:00.0  |
    | 2014-09-30 00:00:00.0  |
    | 2014-09-30 00:00:00.0  |
    +------------------------+

    > SELECT * FROM `TIMESTAMP_RCFILE` where ts is not NULL;
    +------------------------+
    |  timestamp_rcfile.ts   |
    +------------------------+
    | 2014-09-28 00:00:00.0  |
    | 2014-09-29 00:00:00.0  |
    | 2014-09-30 00:00:00.0  |
    +------------------------+

Note 1: The incorrect behavior shown above also reproduces with a standalone Java/JDBC program.

Note 2: I don't know whether other data types are affected, or which other releases are affected; the problem does occur in Hive 0.13. Direct JDBC reads of TEXTFILE and SEQUENCEFILE tables work fine. As stated above, RCFile and ORC deliver wrong results; no other file formats were tested.
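A standalone check along the lines of Note 1 might look like the sketch below. It is not the reporter's program: the class name TimestampReadCheck and the helpers firstMismatch and fetchSorted are hypothetical names chosen for illustration. The JDBC URL is taken from the command line because the report does not give a host or port; the user name, password, driver class, and queries are the ones used in the beeline steps. Rows are sorted before comparison on the assumption that row order between the two queries is not guaranteed.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Objects;

// Hypothetical helper class, not part of Hive or of the original report.
public class TimestampReadCheck {

    /** Returns the index of the first differing row, or -1 if both lists are equal. */
    public static int firstMismatch(List<String> a, List<String> b) {
        int n = Math.min(a.size(), b.size());
        for (int i = 0; i < n; i++) {
            if (!Objects.equals(a.get(i), b.get(i))) {
                return i;
            }
        }
        // Same common prefix: equal only if the lengths also agree.
        return a.size() == b.size() ? -1 : n;
    }

    /** Runs a query and returns the first column as sorted strings (NULLs first). */
    public static List<String> fetchSorted(Connection conn, String sql) throws SQLException {
        List<String> rows = new ArrayList<>();
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(sql)) {
            while (rs.next()) {
                rows.add(rs.getString(1));
            }
        }
        rows.sort(Comparator.nullsFirst(Comparator.<String>naturalOrder()));
        return rows;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("usage: TimestampReadCheck <jdbc-url>");
            return;
        }
        // Driver class as given in the beeline !connect line of the report.
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (Connection conn = DriverManager.getConnection(args[0], "hive", "pass")) {
            // Plain SELECT *: the direct read path described in the report.
            List<String> direct = fetchSorted(conn,
                    "SELECT * FROM `TIMESTAMP_RCFILE`");
            // Adding a predicate forces result set creation, per the report.
            List<String> forced = fetchSorted(conn,
                    "SELECT * FROM `TIMESTAMP_RCFILE` WHERE ts IS NOT NULL");
            int i = firstMismatch(direct, forced);
            if (i < 0) {
                System.out.println("results match");
            } else {
                System.out.println("first mismatch at sorted row " + i);
                System.out.println("direct: " + direct);
                System.out.println("forced: " + forced);
            }
        }
    }
}
```

To try it, put the Hive JDBC driver on the classpath and pass the JDBC URL for your HiveServer2 instance as the only argument. The comparison is only meaningful when the table holds no NULL timestamps, as in the three-row example above, since the second query filters NULLs out.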
-- This message was sent by Atlassian JIRA (v6.3.4#6332)