Return-Path: X-Original-To: apmail-hive-dev-archive@www.apache.org Delivered-To: apmail-hive-dev-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id F3C0418400 for ; Wed, 10 Jun 2015 05:53:58 +0000 (UTC) Received: (qmail 76896 invoked by uid 500); 10 Jun 2015 05:53:58 -0000 Delivered-To: apmail-hive-dev-archive@hive.apache.org Received: (qmail 76818 invoked by uid 500); 10 Jun 2015 05:53:58 -0000 Mailing-List: contact dev-help@hive.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@hive.apache.org Delivered-To: mailing list dev@hive.apache.org Received: (qmail 76802 invoked by uid 99); 10 Jun 2015 05:53:58 -0000 Received: from reviews-vm.apache.org (HELO reviews.apache.org) (140.211.11.40) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 10 Jun 2015 05:53:58 +0000 Received: from reviews.apache.org (localhost [127.0.0.1]) by reviews.apache.org (Postfix) with ESMTP id 0BDEF1DF966; Wed, 10 Jun 2015 05:53:57 +0000 (UTC) Content-Type: multipart/alternative; boundary="===============0053972968410047403==" MIME-Version: 1.0 Subject: Re: Review Request 35107: HIVE-6791 Support variable substition for Beeline shell command From: "cheng xu" To: chinnarao@huawei.com, "Xuefu Zhang" Cc: "cheng xu" , "hive" Date: Wed, 10 Jun 2015 05:53:57 -0000 Message-ID: <20150610055357.27736.60107@reviews.apache.org> X-ReviewBoard-URL: https://reviews.apache.org/ Auto-Submitted: auto-generated Sender: "cheng xu" X-ReviewGroup: hive X-ReviewRequest-URL: https://reviews.apache.org/r/35107/ X-Sender: "cheng xu" References: <20150609044133.27736.70935@reviews.apache.org> In-Reply-To: <20150609044133.27736.70935@reviews.apache.org> Reply-To: "cheng xu" X-ReviewRequest-Repository: hive-git --===============0053972968410047403== MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit > On June 9, 2015, 12:41 p.m., Xuefu Zhang wrote: > > Besides the two minor issues I found in the patch, I was wondering if the approach we are taking is the best. Variable substitution is a server (HS2) behavior, and on this ground I think this should happen in HS2 instead of beeline. Please note that JDBC client may also submit queries with $var in it, and such a case should be also supported. > > > > I also noticed that in Driver class, there is code handling variable substitution. I'm wondering why it's not effective. > > > > Shell command (starting with !sh) is executed in the client (Beeline). I think we are fine if variable substituion doesn't work for shell command. We can address that as a followup taks if desirable. > > cheng xu wrote: > Thanks for your comments. > > `I also noticed that in Driver class, there is code handling variable substitution. I'm wondering why it's not effective.` > > The substitution works well in HS2 currently. > There are two reasons for me to add API getting the conf from HS2. One is to support substitution in sh and source command. In the old cli, source command and sh command worked well with substitution. So this part of this patch is addressing this purpose. Another consideration is for https://issues.apache.org/jira/browse/HIVE-10847 which required some configuration from hive-site.xml. > > Xuefu Zhang wrote: > Yeah. It's a little trickier than thought. Shell command is executed at client side (Beeline) and it doesn't seem making sense to use server specific variables such (env, sys, hiveconf, hivevar) in the shell commands. More importantly, Beeline can connect to multiple serves at the same time, so which configurations should be used to substitue the variables? User should be able to execute shell commands w/o any server connection. > > For CLI, server and client are together, so these don't matter. But for beeline + HS2 deployment, story will be different. > > I don't know what's the best, and all I'm saying is that we need to be very careful on what we doing. Before we decide what to do, we need to clearly define the problem we are trying to solve first. > > cheng xu wrote: > Thank you for your prompt reply. > `Shell command is executed at client side (Beeline) and it doesn't seem making sense to use server specific variables such (env, sys, hiveconf, hivevar) in the shell commands.` > I'm not sure whether substitution for sh and source is useful. We can enable the support of substitution after connection established for beeline unless connected. For the new CLI who is using an embedded connection, it should be supported WRT the backwards compatibility. > > I am a little confused about `connect to multiple serves at the same time`. Does it mean you can use beeline to connect any server in one connection and you can have multiple beeline instances running? (It's the case that user executes the command */opt/apache-hive-1.2.0-SNAPSHOT-bin/bin/hive --service beeline* with specify any hostname) > If so, I think it may cause some errors if no connection available since the current implementation is based on connection by using **SetProcessor**. AFAIK, it's safe to get the configurations from HS2 via **SetProcessor** which is what beeline actually did after connection is established. Connection(session) should only be assiocated with one server. If user didn't connect to any HS2, the substitution for *sh* and *source* should be disabled. To be honest, it will have some negative impacts for the performance since it requires to execute set command. WRT the performance, we can make this support configurable. > > In summary, substitution is enabled unless connection is established for source or sh command considering the backwards compatibility. And we can disable the support for beeline if not reasonable or brings lower performce. For HIVE-10847, I think we still need one way to access the configuration from server side but it is only needed when start a connection. > > Any thoughts? Sorry for below typo. I am a little confused about connect to multiple serves at the same time. Does it mean you can use beeline to connect any server in one connection and you can have multiple beeline instances running? (It's the case that user executes the command /opt/apache-hive-1.2.0-SNAPSHOT-bin/bin/hive --service beeline **without** specify any hostname) - cheng ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/35107/#review87090 ----------------------------------------------------------- On June 5, 2015, 10:09 a.m., cheng xu wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/35107/ > ----------------------------------------------------------- > > (Updated June 5, 2015, 10:09 a.m.) > > > Review request for hive, chinna and Xuefu Zhang. > > > Bugs: HIVE-6791 > https://issues.apache.org/jira/browse/HIVE-6791 > > > Repository: hive-git > > > Description > ------- > > Summary: > 1) move the beeline-cli convertor to the place where cli is executed(class **Commands**) > 2) support substitution for source command > 3) add some unit test for substitution > 4) add one way to get the configuration from HS2 > > > Diffs > ----- > > beeline/src/java/org/apache/hive/beeline/BeeLine.java 45a7e87 > beeline/src/java/org/apache/hive/beeline/BeelineVariableSubstitution.java PRE-CREATION > beeline/src/java/org/apache/hive/beeline/Commands.java a42baa3 > beeline/src/test/org/apache/hive/beeline/cli/TestHiveCli.java 6cbb030 > > Diff: https://reviews.apache.org/r/35107/diff/ > > > Testing > ------- > > Unit test passed > > > Thanks, > > cheng xu > > --===============0053972968410047403==--