Date: Mon, 9 Nov 2020 17:05:00 +0000 (UTC)
From: "ASF GitHub Bot (Jira)"
To: issues@hive.apache.org
Reply-To: dev@hive.apache.org
Subject: [jira] [Work logged] (HIVE-24230) Integrate HPL/SQL into HiveServer2

[
https://issues.apache.org/jira/browse/HIVE-24230?focusedWorklogId=509259&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-509259 ]

ASF GitHub Bot logged work on HIVE-24230:
-----------------------------------------

            Author: ASF GitHub Bot
        Created on: 09/Nov/20 17:04
        Start Date: 09/Nov/20 17:04
Worklog Time Spent: 10m
  Work Description: zeroflag commented on a change in pull request #1633:
URL: https://github.com/apache/hive/pull/1633#discussion_r519973087

##########
File path: service/src/java/org/apache/hive/service/cli/operation/ExecuteStatementOperation.java
##########
@@ -45,6 +62,21 @@ public static ExecuteStatementOperation newExecuteStatementOperation(HiveSession
       throws HiveSQLException {

     String cleanStatement = HiveStringUtils.removeComments(statement);
+    if (!HPLSQL.equals(confOverlay.get(QUERY_EXECUTOR)) && hplSqlMode()) {
+      if (SessionState.get().getHplsqlInterpreter() == null) {
+        Exec interpreter = new Exec(
+                new Conf(),
+                new BeelineConsole(),
+                ResultListener.NONE,
+                new HplSqlQueryExecutor(parentSession),
+                parentSession.getMetaStoreClient(),
+                new HiveHplSqlSessionState(SessionState.get())

Review comment:
@mustafaiman, I added a unit test to verify it, and it looks like it still works this way. HiveHplSqlSessionState holds a reference to SessionState, which changes when the current db is modified. HiveHplSqlSessionState is still needed because SessionState is in the ql project, which cannot be referenced from hplsql without introducing a circular dependency.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id:     (was: 509259)
    Time Spent: 0.5h  (was: 20m)

> Integrate HPL/SQL into HiveServer2
> ----------------------------------
>
>                 Key: HIVE-24230
>                 URL: https://issues.apache.org/jira/browse/HIVE-24230
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2, hpl/sql
>            Reporter: Attila Magyar
>            Assignee: Attila Magyar
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> HPL/SQL is a standalone command line program that can store and load scripts from text files, or from Hive Metastore (since HIVE-24217). Currently HPL/SQL depends on Hive and not the other way around.
> Changing the dependency order between HPL/SQL and HiveServer would open up some possibilities which are currently not feasible to implement. For example, one might want to use a third-party SQL tool to run selects on stored procedure (or rather function in this case) outputs.
> {code:java}
> SELECT * from myStoredProcedure(1, 2); {code}
> HPL/SQL doesn’t have a JDBC interface and it’s not a daemon, so this would not work with the current architecture.
> Another important factor is performance. Declarative SQL commands are sent to Hive via JDBC by HPL/SQL. The integration would make it possible to drop JDBC and use HiveServer’s internal API for compilation and execution.
> The third factor is that existing tools like Beeline or Hue cannot be used with HPL/SQL since it has its own, separate CLI.
>
> To make it easier to implement, we keep things separated internally at first, by introducing a Hive session-level JDBC parameter.
> {code:java}
> jdbc:hive2://localhost:10000/default;hplsqlMode=true {code}
>
> The hplsqlMode indicates that we are in procedural SQL mode where the user can create and call stored procedures.
> HPLSQL allows you to write any kind of procedural statement at the top level. This patch doesn't limit this, but it might be better to eventually restrict which statements are allowed outside of stored procedures.
>
> Since HPLSQL and Hive are running in the same process, there is no need to use the JDBC driver between them. The patch adds an abstraction with 2 different implementations, one for executing queries on JDBC (for keeping the existing behaviour) and another one for directly calling Hive's compiler. In HPLSQL mode the latter is used.
> Internally, a new operation (HplSqlOperation) and operation type (PROCEDURAL_SQL) were added. The operation works similarly to the SQLOperation but uses the hplsql interpreter to execute arbitrary scripts. This operation might spawn new SQLOperations.
> For example, consider the following statement:
> {code:java}
> FOR i in 1..10 LOOP
>   SELECT * FROM table
> END LOOP;{code}
> We send this to beeline while we're in hplsql mode. Hive will create an hplsql interpreter and store it in the session state. A new HplSqlOperation is created to run the script on the interpreter.
> HPLSQL knows how to execute the for loop, but it'll call Hive to run the select expression. The HplSqlOperation is notified when the select reads a row and accumulates the rows into a RowSet (memory consumption needs to be considered here), which can be retrieved via thrift from the client side.
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
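
The issue description above mentions an abstraction with two implementations (one going through JDBC, one calling Hive's compiler directly) and an operation that accumulates rows while the interpreter runs a loop. The following is only a minimal sketch of that shape, not the code from the pull request: every name in it (QueryExecutorSketch, QueryExecutor, JdbcStyleExecutor, InProcessExecutor, forSession, runForLoop) is hypothetical, and both executors are stubs that fabricate a single placeholder row instead of talking to Hive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: none of these names are Hive or HPL/SQL APIs.
class QueryExecutorSketch {

    /** Assumed abstraction: runs one declarative statement and reports each row. */
    interface QueryExecutor {
        void execute(String sql, Consumer<String> onRow);
    }

    /** Stand-in for the JDBC-backed implementation (no real connection is opened). */
    static final class JdbcStyleExecutor implements QueryExecutor {
        @Override
        public void execute(String sql, Consumer<String> onRow) {
            onRow.accept("jdbc: " + sql); // placeholder row
        }
    }

    /** Stand-in for the implementation that would call the compiler in-process. */
    static final class InProcessExecutor implements QueryExecutor {
        @Override
        public void execute(String sql, Consumer<String> onRow) {
            onRow.accept("in-process: " + sql); // placeholder row
        }
    }

    /** Picks the in-process executor when the session is in hplsql mode. */
    static QueryExecutor forSession(boolean hplsqlMode) {
        return hplsqlMode ? new InProcessExecutor() : new JdbcStyleExecutor();
    }

    /** Mimics "FOR i IN from..to LOOP <sql> END LOOP" and accumulates all rows. */
    static List<String> runForLoop(QueryExecutor executor, int from, int to, String sql) {
        List<String> rows = new ArrayList<>(); // grows with every row read
        for (int i = from; i <= to; i++) {
            executor.execute(sql, rows::add); // the loop is interpreted here, the select is delegated
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> rows = runForLoop(forSession(true), 1, 10, "SELECT * FROM t");
        System.out.println(rows.size() + " rows, first: " + rows.get(0));
    }
}
```

The row list in runForLoop grows without bound, which is the same memory concern the description raises for the RowSet; a real implementation would presumably need some limit or streaming strategy.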