Date: Tue, 1 Dec 2015 07:43:11 +0000 (UTC)
From: "Prasanth Jayachandran (JIRA)"
To: issues@hive.apache.org
Reply-To: dev@hive.apache.org
Subject: [jira] [Updated] (HIVE-11312) ORC format: where clause with CHAR data type not returning any rows

     [ https://issues.apache.org/jira/browse/HIVE-11312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasanth Jayachandran updated HIVE-11312:
-----------------------------------------
    Attachment: HIVE-11312.3.patch

Addressed [~gopalv] comments. The conversion of string to char now happens during semantic analysis. ORC expects the string constants to be padded properly, else bloom filters will break.
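To illustrate why the padding of the constant matters, here is a minimal sketch in plain Java. It is not the code from the patch: the class `CharLiteralPadding` and the helper `padToCharLength` are hypothetical names, and the sketch only assumes that a CHAR(10) value is stored padded with trailing spaces to 10 characters, as the issue description states.

```java
// Hypothetical sketch, not Hive's implementation: pad a string constant to the
// declared CHAR(n) length so that it matches the padded value stored by ORC.
class CharLiteralPadding {

    // Appends trailing spaces until the literal reaches charLength characters.
    // Literals that are already that long (or longer) are returned unchanged;
    // handling of over-long values is deliberately left out of this sketch.
    static String padToCharLength(String literal, int charLength) {
        StringBuilder sb = new StringBuilder(literal);
        while (sb.length() < charLength) {
            sb.append(' ');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumed stored form of the value '1' in a CHAR(10) column.
        String stored = String.format("%-10s", "1");
        String literal = "1";

        System.out.println(literal.equals(stored));                      // false
        System.out.println(padToCharLength(literal, 10).equals(stored)); // true
    }
}
```

Any exact-match structure (a bloom filter hashes the exact characters it is given) would see the unpadded and padded strings as different values, which is consistent with the remark that unpadded constants break bloom filters.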
> ORC format: where clause with CHAR data type not returning any rows
> -------------------------------------------------------------------
>
>                 Key: HIVE-11312
>                 URL: https://issues.apache.org/jira/browse/HIVE-11312
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 1.2.0, 1.3.0, 1.2.1, 2.0.0
>            Reporter: Thomas Friedrich
>            Assignee: Prasanth Jayachandran
>              Labels: orc
>         Attachments: HIVE-11312.1.patch, HIVE-11312.2.patch, HIVE-11312.3.patch
>
>
> Test case:
> Setup:
> create table orc_test( col1 string, col2 char(10)) stored as orc tblproperties ("orc.compress"="NONE");
> insert into orc_test values ('val1', '1');
> Query:
> select * from orc_test where col2='1';
> Query returns no row.
> Problem is introduced with HIVE-10286, class RecordReaderImpl.java, method evaluatePredicateRange.
> Old code:
> -    Object baseObj = predicate.getLiteral(PredicateLeaf.FileFormat.ORC);
> -    Object minValue = getConvertedStatsObj(min, baseObj);
> -    Object maxValue = getConvertedStatsObj(max, baseObj);
> -    Object predObj = getBaseObjectForComparison(baseObj, minValue);
> New code:
> +    Object baseObj = predicate.getLiteral();
> +    Object minValue = getBaseObjectForComparison(predicate.getType(), min);
> +    Object maxValue = getBaseObjectForComparison(predicate.getType(), max);
> +    Object predObj = getBaseObjectForComparison(predicate.getType(), baseObj);
> The values for min and max are of type String and contain as many characters as the CHAR column declares. For example, if the type is CHAR(10) and the row has value 1, the value of String min is "1         " (the digit followed by nine trailing spaces).
> Before Hive 1.2, the method getConvertedStatsObj would call StringUtils.stripEnd(statsObj.toString(), null), which would remove the trailing spaces from min and max. Later, in the compareToRange method, it was able to compare "1" with "1".
> In Hive 1.2, with the use of the getBaseObjectForComparison method, it simply returns obj.toString() if the data type is String, which means minValue and maxValue are still "1         ".
> As a result, the compareToRange method returns a wrong value ("1".compareTo("1         ") yields -9 instead of 0).
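The comparison described in the issue can be reproduced outside Hive with plain Java strings. The sketch below is only an illustration: `CharRangeComparison`, `stripTrailingWhitespace`, and `inRange` are hypothetical stand-ins, with `stripTrailingWhitespace` approximating what `StringUtils.stripEnd(str, null)` does, and `inRange` approximating a min/max range check rather than reproducing `compareToRange`.

```java
// Hypothetical sketch of the comparison problem; not the RecordReaderImpl code.
class CharRangeComparison {

    // Stand-in for StringUtils.stripEnd(s, null): drops trailing whitespace.
    static String stripTrailingWhitespace(String s) {
        int end = s.length();
        while (end > 0 && Character.isWhitespace(s.charAt(end - 1))) {
            end--;
        }
        return s.substring(0, end);
    }

    // True when the literal falls within the inclusive range [min, max].
    static boolean inRange(String literal, String min, String max) {
        return literal.compareTo(min) >= 0 && literal.compareTo(max) <= 0;
    }

    public static void main(String[] args) {
        // Column statistics for a CHAR(10) column whose only value is '1'.
        String min = String.format("%-10s", "1");
        String max = min;
        String literal = "1";

        // Equal prefix, so compareTo returns the length difference: 1 - 10.
        System.out.println(literal.compareTo(min));       // -9

        // Padded statistics: the literal appears to be below the minimum.
        System.out.println(inRange(literal, min, max));   // false

        // Stripped statistics: the literal is inside the range again.
        System.out.println(inRange(literal,
                stripTrailingWhitespace(min),
                stripTrailingWhitespace(max)));           // true
    }
}
```

With the padded statistics the literal sorts before the minimum, so a reader that trusts the range would skip the data, which matches the "Query returns no row" symptom in the test case.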