Return-Path:
X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io
Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io
Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id C5C07200B3B for ; Mon, 11 Jul 2016 23:21:12 +0200 (CEST)
Received: by cust-asf.ponee.io (Postfix) id C42C3160A78; Mon, 11 Jul 2016 21:21:12 +0000 (UTC)
Delivered-To: archive-asf-public@cust-asf.ponee.io
Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 1A95B160A5E for ; Mon, 11 Jul 2016 23:21:11 +0200 (CEST)
Received: (qmail 92342 invoked by uid 500); 11 Jul 2016 21:21:11 -0000
Mailing-List: contact issues-help@hive.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: dev@hive.apache.org
Delivered-To: mailing list issues@hive.apache.org
Received: (qmail 92322 invoked by uid 99); 11 Jul 2016 21:21:11 -0000
Received: from arcas.apache.org (HELO arcas) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 11 Jul 2016 21:21:11 +0000
Received: from arcas.apache.org (localhost [127.0.0.1]) by arcas (Postfix) with ESMTP id 016402C02A3 for ; Mon, 11 Jul 2016 21:21:11 +0000 (UTC)
Date: Mon, 11 Jul 2016 21:21:11 +0000 (UTC)
From: "Matt McCline (JIRA)"
To: issues@hive.apache.org
Message-ID:
In-Reply-To:
References:
Subject: [jira] [Commented] (HIVE-13974) ORC Schema Evolution doesn't support add columns to non-last STRUCT columns
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394
archived-at: Mon, 11 Jul 2016 21:21:13 -0000

    [ https://issues.apache.org/jira/browse/HIVE-13974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15371651#comment-15371651 ]

Matt McCline commented on HIVE-13974:
-------------------------------------

[~owen.omalley] Thanks for looking at this.

No, the semantics of sameCategoryAndAttributes are different from those of equals. The TypeDescription.equals method compares the (type) id and maximumId, which does not work when there is an interior STRUCT column with a different number of columns. That makes it seem like a type conversion is needed when it is not, and other parts of the code then throw exceptions complaining "no need to convert a STRING to a STRING".

There are 3 kinds of schema, not 2. Part of the problem I'm trying to solve is the ambiguity in different parts of the code as to which schema is being used. Is it the one returned by the input file format, is it the schema fed back to the ORC raw merger that includes the ACID columns, or is it the unconverted file schema? I don't care what the first 2 schemas are called as long as the names are distinct. Maybe the names could be reader, internalReader, and file.

About ORC-54: it is not practical right now in terms of time. We have got to get Erie out the door. We have so little runway left. I've had 10+ JIRAs for weeks. Whenever I knock some down, more appear. Also, there really needs to be a parallel HIVE JIRA for it, and we must make sure name mapping is fully supported for HIVE. Given how *difficult* Schema Evolution has been, I simply don't believe it will *just work* with ORC-only unit tests.
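
To illustrate the point about TypeDescription.equals above, here is a minimal sketch of why an id-sensitive comparison misfires when an interior STRUCT gains a column. It does not use the real ORC classes: Node, assignIds, equalsWithIds, and sameCategory are hypothetical stand-ins, and the sketch assumes column ids are assigned in pre-order, so every column after the widened STRUCT gets a shifted id.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of a schema tree; NOT the real ORC TypeDescription.
class SchemaIdDemo {
    static final class Node {
        final String category;                 // e.g. "struct", "int", "string"
        final List<Node> children = new ArrayList<>();
        int id;                                // pre-order column id
        int maximumId;                         // largest id inside this subtree

        Node(String category, Node... kids) {
            this.category = category;
            for (Node k : kids) {
                children.add(k);
            }
        }
    }

    // Assign ids in pre-order starting at startId; returns the next unused id.
    static int assignIds(Node n, int startId) {
        n.id = startId;
        int next = startId + 1;
        for (Node c : n.children) {
            next = assignIds(c, next);
        }
        n.maximumId = next - 1;
        return next;
    }

    // Id-sensitive comparison, modeled on the behavior described in the comment.
    static boolean equalsWithIds(Node a, Node b) {
        return a.category.equals(b.category)
            && a.id == b.id
            && a.maximumId == b.maximumId;
    }

    // Id-insensitive comparison: only the type category is checked.
    static boolean sameCategory(Node a, Node b) {
        return a.category.equals(b.category);
    }

    // File schema: struct<a:struct<x:int>, b:string>
    static Node fileSchema() {
        Node root = new Node("struct",
            new Node("struct", new Node("int")),
            new Node("string"));
        assignIds(root, 0);
        return root;
    }

    // Reader schema: struct<a:struct<x:int, y:int>, b:string> (column y added to interior STRUCT a)
    static Node readerSchema() {
        Node root = new Node("struct",
            new Node("struct", new Node("int"), new Node("int")),
            new Node("string"));
        assignIds(root, 0);
        return root;
    }

    public static void main(String[] args) {
        Node fileB = fileSchema().children.get(1);
        Node readerB = readerSchema().children.get(1);
        System.out.println("file b id=" + fileB.id + ", reader b id=" + readerB.id);
        System.out.println("equalsWithIds: " + equalsWithIds(fileB, readerB));
        System.out.println("sameCategory:  " + sameCategory(fileB, readerB));
    }
}
```

In this model the STRING column b is unchanged between the two schemas, yet the id-sensitive check reports a mismatch because its id moved; a category-only check does not.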

FYI [~hagleitn] [~ekoifman]

> ORC Schema Evolution doesn't support add columns to non-last STRUCT columns
> ---------------------------------------------------------------------------
>
>                 Key: HIVE-13974
>                 URL: https://issues.apache.org/jira/browse/HIVE-13974
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive, ORC, Transactions
>    Affects Versions: 1.3.0, 2.1.0, 2.2.0
>            Reporter: Matt McCline
>            Assignee: Matt McCline
>            Priority: Blocker
>         Attachments: HIVE-13974.01.patch, HIVE-13974.02.patch, HIVE-13974.03.patch, HIVE-13974.04.patch, HIVE-13974.05.WIP.patch, HIVE-13974.06.patch, HIVE-13974.07.patch, HIVE-13974.08.patch, HIVE-13974.09.patch, HIVE-13974.091.patch
>
>
> Currently, the included columns are based on the fileSchema and not the readerSchema which doesn't work for adding columns to non-last STRUCT data type columns.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)