Date: Fri, 25 Mar 2016 01:57:25 +0000 (UTC)
From: "Prasanth Jayachandran (JIRA)"
To: issues@hive.apache.org
Reply-To: dev@hive.apache.org
Subject: [jira] [Commented] (HIVE-13330) ORC vectorized string dictionary reader does not differentiate null vs empty string dictionary

    [ https://issues.apache.org/jira/browse/HIVE-13330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15211279#comment-15211279 ]

Prasanth Jayachandran commented on HIVE-13330:
----------------------------------------------

[~gopalv] Can you please review the patch?

> ORC vectorized string dictionary reader does not differentiate null vs empty string dictionary
> ----------------------------------------------------------------------------------------------
>
>                 Key: HIVE-13330
>                 URL: https://issues.apache.org/jira/browse/HIVE-13330
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 1.3.0, 2.0.0, 2.1.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>            Priority: Critical
>         Attachments: HIVE-13330.1.patch, HIVE-13330.2.patch
>
>
> The vectorized string dictionary reader cannot distinguish between the case where all dictionary entries are null and the case where the dictionary holds a single empty-string entry. This causes wrong results when reading data from such files.
> {code:title=Vectorization On}
> SET hive.vectorized.execution.enabled=true;
> SET hive.fetch.task.conversion=none;
> select vcol from testnullorc3 limit 1;
> OK
> NULL
> {code}
> {code:title=Vectorization Off}
> SET hive.vectorized.execution.enabled=false;
> SET hive.fetch.task.conversion=none;
> select vcol from testnullorc3 limit 1;
> OK
>
> {code}
> The input table testnullorc3 contains a varchar column vcol with a few empty strings and a few nulls. For this table, the non-vectorized reader returns an empty string as the first row, but the vectorized reader returns NULL.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
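
For illustration only, here is a minimal Java sketch of the kind of ambiguity the issue describes. It is a toy model, not the Hive or ORC reader code and not the attached patch: the class DictionaryNullSketch, its Stripe type, and both read methods are hypothetical names. It assumes the confusion comes from using the dictionary's total byte length as the "everything is null" signal, since that length is zero both for a dictionary with no entries and for one that holds a single empty string.

```java
// Toy model only: NOT Hive's or ORC's reader classes, and not the attached patch.
final class DictionaryNullSketch {

    /** Hypothetical, simplified view of one dictionary-encoded string column. */
    static final class Stripe {
        final byte[] dictionaryBytes; // all dictionary entries concatenated
        final int[] offsets;          // entry i spans offsets[i] .. offsets[i + 1]
        final int entryCount;         // number of dictionary entries
        final Integer[] rows;         // dictionary index per row, null for a NULL row

        Stripe(byte[] dictionaryBytes, int[] offsets, int entryCount, Integer[] rows) {
            this.dictionaryBytes = dictionaryBytes;
            this.offsets = offsets;
            this.entryCount = entryCount;
            this.rows = rows;
        }
    }

    /** One dictionary entry (the empty string); rows are "", NULL, "". */
    static Stripe emptyStringAndNulls() {
        return new Stripe(new byte[0], new int[] {0, 0}, 1, new Integer[] {0, null, 0});
    }

    /** No dictionary entries at all; every row is NULL. */
    static Stripe allNulls() {
        return new Stripe(new byte[0], new int[] {0}, 0, new Integer[] {null, null, null});
    }

    /** Ambiguous check: zero dictionary bytes is taken to mean "all rows are NULL". */
    static String[] readByByteLength(Stripe s) {
        if (s.dictionaryBytes.length == 0) {
            return new String[s.rows.length]; // every element stays null
        }
        return decode(s);
    }

    /** Unambiguous check: only a dictionary with zero entries means "all rows are NULL". */
    static String[] readByEntryCount(Stripe s) {
        if (s.entryCount == 0) {
            return new String[s.rows.length];
        }
        return decode(s);
    }

    /** Resolves each non-null row to its dictionary entry. */
    private static String[] decode(Stripe s) {
        String[] out = new String[s.rows.length];
        for (int r = 0; r < s.rows.length; r++) {
            Integer idx = s.rows[r];
            if (idx != null) {
                int start = s.offsets[idx];
                int length = s.offsets[idx + 1] - start;
                out[r] = new String(s.dictionaryBytes, start, length,
                        java.nio.charset.StandardCharsets.UTF_8);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Stripe mixed = emptyStringAndNulls();
        System.out.println(java.util.Arrays.toString(readByByteLength(mixed)));
        System.out.println(java.util.Arrays.toString(readByEntryCount(mixed)));
        System.out.println(java.util.Arrays.toString(readByEntryCount(allNulls())));
    }
}
```

In this sketch the byte-length check turns the empty-string rows into nulls, which mirrors the NULL seen with vectorization on, while the entry-count check keeps them as empty strings. Whether the real reader should key on an entry count, on the offsets, or on something else is a question for the patch under review.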