Date: Thu, 23 Apr 2015 01:02:38 +0000 (UTC)
From: "Pengcheng Xiong (JIRA)"
To: dev@hive.apache.org
Subject: [jira] [Created] (HIVE-10455) CBO (Calcite Return Path): Different data types at Reducer before JoinOp

Pengcheng Xiong created HIVE-10455:
--------------------------------------

             Summary: CBO (Calcite Return Path): Different data types at Reducer before JoinOp
                 Key: HIVE-10455
                 URL: https://issues.apache.org/jira/browse/HIVE-10455
             Project: Hive
          Issue Type: Sub-task
            Reporter: Pengcheng Xiong
            Assignee: Pengcheng Xiong

The following error occurred for cbo_subq_not_in.q
{code}
java.lang.Exception: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error: Unable to deserialize reduce input key from x1x128x0x0x1 with properties {columns=reducesinkkey0, serialization.lib=org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe, serialization.sort.order=+, columns.types=double}
	at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
{code}

An easier way to reproduce it is
{code}
set hive.cbo.enable=true;
set hive.exec.check.crossproducts=false;
set hive.stats.fetch.column.stats=true;
set hive.auto.convert.join=false;

select p_size, src.key from part join src on p_size=key;
{code}

As you can see, p_size is an integer while src.key is a string, so both of them should be cast to double when they are joined. When the return path is off, this cast happens before the Join, at the RS (ReduceSink). However, when the return path is on, the cast is treated as an expression inside the Join. Thus, when the reducer collects keys of different types from different join branches, it throws the exception above.
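The key bytes in the error message, x1x128x0x0x1, look like a 4-byte integer rather than the 8-byte double the reducer was configured to read. The sketch below illustrates that mismatch under a simplified, assumed model of a binary-sortable key layout: one non-null marker byte, followed by the big-endian value with its sign bit flipped (and all bits flipped for negative doubles). The function names are hypothetical and this is not Hive's actual BinarySortableSerDe code.

```python
import struct

NOT_NULL = 1  # assumed marker byte that precedes a non-null value


def encode_int_key(value: int) -> bytes:
    """Marker byte, then a big-endian 32-bit int with its sign bit flipped."""
    raw = struct.pack(">i", value)
    return bytes([NOT_NULL, raw[0] ^ 0x80]) + raw[1:]


def decode_int_key(key: bytes) -> int:
    """Inverse of encode_int_key; expects exactly 1 + 4 bytes."""
    if len(key) != 5 or key[0] != NOT_NULL:
        raise ValueError(f"expected 5 bytes for an int key, got {len(key)}")
    raw = bytes([key[1] ^ 0x80]) + key[2:]
    return struct.unpack(">i", raw)[0]


def encode_double_key(value: float) -> bytes:
    """Marker byte, then the 64-bit pattern made sortable as unsigned bytes."""
    bits = int.from_bytes(struct.pack(">d", value), "big")
    if bits & (1 << 63):
        bits ^= (1 << 64) - 1  # negative: flip every bit
    else:
        bits ^= 1 << 63        # non-negative: flip only the sign bit
    return bytes([NOT_NULL]) + bits.to_bytes(8, "big")


def decode_double_key(key: bytes) -> float:
    """Inverse of encode_double_key; expects exactly 1 + 8 bytes."""
    if len(key) != 9 or key[0] != NOT_NULL:
        raise ValueError(f"expected 9 bytes for a double key, got {len(key)}")
    bits = int.from_bytes(key[1:], "big")
    if bits & (1 << 63):
        bits ^= 1 << 63        # was non-negative: restore the sign bit
    else:
        bits ^= (1 << 64) - 1  # was negative: flip every bit back
    return struct.unpack(">d", bits.to_bytes(8, "big"))[0]


if __name__ == "__main__":
    key_from_error = bytes([1, 128, 0, 0, 1])
    print("read as int:", decode_int_key(key_from_error))
    try:
        decode_double_key(key_from_error)
    except ValueError as err:
        print("read as double fails:", err)
```

Under this model the five bytes decode cleanly as an integer key, while a reader expecting a double rejects them for being too short. That matches the description above: one join branch emits an uncast integer key while the reducer expects columns.types=double.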