Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 4E6DA200C10 for ; Fri, 3 Feb 2017 21:04:25 +0100 (CET) Received: by cust-asf.ponee.io (Postfix) id 4D03E160B43; Fri, 3 Feb 2017 20:04:25 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 968AD160B3F for ; Fri, 3 Feb 2017 21:04:24 +0100 (CET) Received: (qmail 25477 invoked by uid 500); 3 Feb 2017 20:04:23 -0000 Mailing-List: contact reviews-help@impala.incubator.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Delivered-To: mailing list reviews@impala.incubator.apache.org Received: (qmail 25466 invoked by uid 99); 3 Feb 2017 20:04:23 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd3-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 03 Feb 2017 20:04:23 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd3-us-west.apache.org (ASF Mail Server at spamd3-us-west.apache.org) with ESMTP id 14C15181BC0 for ; Fri, 3 Feb 2017 20:04:23 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd3-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 0.362 X-Spam-Level: X-Spam-Status: No, score=0.362 tagged_above=-999 required=6.31 tests=[RDNS_DYNAMIC=0.363, SPF_PASS=-0.001] autolearn=disabled Received: from mx1-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd3-us-west.apache.org [10.40.0.10]) (amavisd-new, port 10024) with ESMTP id I5RjsL9SbNcH for ; Fri, 3 Feb 2017 20:04:22 +0000 (UTC) Received: from ip-10-146-233-104.ec2.internal (ec2-75-101-130-251.compute-1.amazonaws.com [75.101.130.251]) by mx1-lw-eu.apache.org (ASF Mail Server at mx1-lw-eu.apache.org) with ESMTPS id 40F3F5F5F8 for ; Fri, 3 Feb 2017 20:04:21 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by ip-10-146-233-104.ec2.internal (8.14.4/8.14.4) with ESMTP id v13K4EjK032049; Fri, 3 Feb 2017 20:04:14 GMT Message-Id: <201702032004.v13K4EjK032049@ip-10-146-233-104.ec2.internal> Date: Fri, 3 Feb 2017 20:04:14 +0000 From: "Sailesh Mukil (Code Review)" To: Matthew Jacobs , impala-cr@cloudera.com, reviews@impala.incubator.apache.org CC: Dan Hecht , Thomas Tauber-Marshall Reply-To: sailesh@cloudera.com X-Gerrit-MessageType: comment Subject: =?UTF-8?Q?=5BImpala-ASF-CR=5D_IMPALA-4828=3A_Alter_Kudu_schema_outside_Impala_may_crash_on_read=0A?= X-Gerrit-Change-Id: I6d43f5bb9811e728ad592933066d006c8fb4553a X-Gerrit-ChangeURL: X-Gerrit-Commit: ab9b6a4d944f92fb2a4b7457fc3b839d0e673d9d In-Reply-To: References: MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Content-Disposition: inline User-Agent: Gerrit/2.12.2 archived-at: Fri, 03 Feb 2017 20:04:25 -0000 Sailesh Mukil has posted comments on this change. Change subject: IMPALA-4828: Alter Kudu schema outside Impala may crash on read ...................................................................... Patch Set 1: (4 comments) http://gerrit.cloudera.org:8080/#/c/5840/1//COMMIT_MSG Commit Message: Line 20: open, even if the schema changes concurrently. It would probably be worth adding that this is the earliest point and the coarsest granularity at which we can successfully detect a schema change if there is any, without leaving room for error, just to assuage any concerns of whether we're doing this check too often. http://gerrit.cloudera.org:8080/#/c/5840/1/be/src/exec/kudu-scanner.cc File be/src/exec/kudu-scanner.cc: PS1, Line 133: for (int i = 0; i < tuple_desc->slots().size(); ++i) { As we spoke, if an ALTER TABLE is done and a column is removed, a crash might occur in this loop. Another point I wanted to call out: What's the expected behavior when a column is added? 1) Currently, this might pass and we will probably return the results for N-1 columns successfully. Is that what we want? 2) Or should we fail the query? 3) Or should we return the results and also give a warning asking the user to REFRESH? My vote is for the 3rd option, but I'm open to others take on this. http://gerrit.cloudera.org:8080/#/c/5840/1/be/src/exec/kudu-util.cc File be/src/exec/kudu-util.cc: PS1, Line 77: PrimitiveType KuduColTypeToPrimitiveType( : const KuduColumnSchema::DataType& type) { This may be an over optimization, but what if we had an array mapping KuduColumnSchemas to PrimitiveTypes? Since they both are just enums. Just worried about the case where there are a large number of columns and we bounce on the switch statement for every one of them for every scan token. If you don't think it's too pressing, we can forego that. http://gerrit.cloudera.org:8080/#/c/5840/1/common/thrift/generate_error_codes.py File common/thrift/generate_error_codes.py: PS1, Line 314: is type nit: is of type -- To view, visit http://gerrit.cloudera.org:8080/5840 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6d43f5bb9811e728ad592933066d006c8fb4553a Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: Yes