Date: Wed, 7 Mar 2018 01:59:00 +0000 (UTC)
From: "ASF GitHub Bot (JIRA)"
To: issues@drill.apache.org
Reply-To: dev@drill.apache.org
Subject: [jira] [Commented] (DRILL-6187) Exception in RPC communication between DataClient/ControlClient and respective servers when bit-to-bit security is on

    [ https://issues.apache.org/jira/browse/DRILL-6187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16388877#comment-16388877 ]

ASF GitHub Bot commented on DRILL-6187:
---------------------------------------

Github user vrozov commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1145#discussion_r172678133

    --- Diff: exec/rpc/src/main/java/org/apache/drill/exec/rpc/BasicClient.java ---
    @@ -182,6 +196,66 @@ public boolean isActive() {

       protected abstract void validateHandshake(HR validateHandshake) throws RpcException;

    +  /**
    +   * Creates various instances needed to start the SASL handshake. This is called from
    +   * {@link BasicClient#validateHandshake(MessageLite)} if authentication is required from server side.
    +   * @param connectionListener
    +   * @throws RpcException
    +   */
    +  protected abstract void prepareSaslHandshake(final RpcConnectionHandler connectionListener) throws RpcException;
    --- End diff --

    None of the implementations seems to throw an RpcException.

> Exception in RPC communication between DataClient/ControlClient and respective servers when bit-to-bit security is on
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-6187
>                 URL: https://issues.apache.org/jira/browse/DRILL-6187
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - RPC, Security
>            Reporter: Sorabh Hamirwasia
>            Assignee: Sorabh Hamirwasia
>            Priority: Major
>             Fix For: 1.13.0
>
>
> Below is a summary of the issue:
>
> *Scenario:*
> It seems that the first sendRecordBatch was sent to the Foreman, which initiated the authentication handshake. But before initiating the handshake for auth, we establish a connection and store it in a registry. If another record batch is to be sent in parallel (by a different minor fragment running on the same Drillbit), that sender sees the connection available in the registry and initiates the send. This second request reached the Foreman before authentication completed, so the Foreman threw the exception below, saying that an RPC type 3 message is not allowed, and closed the connection. This also failed the authentication handshake that was in progress. Here are the logs with details:
>
> *Foreman received the SASL_START message from another node:*
> 2018-02-21 18:43:30,759 [BitServer-4] TRACE o.a.d.e.r.s.ServerAuthenticationHandler - Received SASL message SASL_START from /10.10.100.161:35482
>
> *Then, around the same time, it received another message from the client, of RPC type 3 (SendRecordBatch), which fails because the handshake has not completed yet:*
>
> 2018-02-21 18:43:30,762 [BitServer-4] ERROR o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication.  Connection: /10.10.100.162:31012 <--> /10.10.100.161:35482 (data server).  Closing connection.
> io.netty.handler.codec.DecoderException: org.apache.drill.exec.rpc.RpcException: Request of type 3 is not allowed without authentication. Client on /10.10.100.161:35482 must authenticate before making requests. Connection dropped. [Details: Encryption: enabled , MaxWrappedSize: 65536 , WrapSizeLimit: 0]
>
> *Then the client receives a channel closed exception:*
>
> 2018-02-21 18:43:30,764 [BitClient-4] WARN  o.a.d.exec.rpc.RpcExceptionHandler - Exception occurred with closed channel.  Connection: /10.10.100.161:35482 <--> 10.10.100.162:31012 (data client)
>
> *As a result, its initial authentication command also fails. Given the channel closed exception above, I think that is what triggered the failure of the authentication request as well:*
>
> Caused by: org.apache.drill.exec.rpc.RpcException: Command failed while establishing connection.  Failure type AUTHENTICATION.
>         at org.apache.drill.exec.rpc.RpcException.mapException(RpcException.java:67) ~[drill-rpc-1.12.0-mapr.jar:1.12.0-mapr]
>         at org.apache.drill.exec.rpc.ListeningCommand.connectionFailed(ListeningCommand.java:66) ~[drill-rpc-1.12.0-mapr.jar:1.12.0-mapr]
>         at org.apache.drill.exec.rpc.data.DataTunnel$SendBatchAsyncListen.connectionFailed(DataTunnel.java:166) ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
>         at org.apache.drill.exec.rpc.data.DataClient$AuthenticationCommand.connectionSucceeded(DataClient.java:203) ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
>         at org.apache.drill.exec.rpc.data.DataClient$AuthenticationCommand.connectionSucceeded(DataClient.java:147) ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
>         at org.apache.drill.exec.rpc.ReconnectingConnection$ConnectionListeningFuture.waitAndRun(ReconnectingConnection.java:122) ~[drill-rpc-1.12.0-mapr.jar:1.12.0-mapr]
>         at org.apache.drill.exec.rpc.ReconnectingConnection.runCommand(ReconnectingConnection.java:83) ~[drill-rpc-1.12.0-mapr.jar:1.12.0-mapr]
>         at org.apache.drill.exec.rpc.data.DataTunnel.sendRecordBatch(DataTunnel.java:84) ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
>         at org.apache.drill.exec.ops.AccountingDataTunnel.sendRecordBatch(AccountingDataTunnel.java:45) ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
>         at org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext(SingleSenderCreator.java:127) ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
>
> So I think there is a concurrency issue: even though authentication has not completed, other requests are sent to the remote node as soon as the TCP connection is available. Instead, they should wait until authentication completes. For example, the TCP connection should be made available from the registry only once authentication is complete.
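
The direction suggested at the end of the issue description (hand out a connection from the registry only after authentication has completed) can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not Drill's implementation and not the change made in the pull request: `AuthGatedRegistry`, `acquire`, and the `connectAndAuthenticate` callback are hypothetical names, and the connection type is left generic.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/**
 * Hypothetical registry (not a Drill class). It stores a future per endpoint
 * instead of the raw connection, so callers cannot use a connection whose
 * authentication handshake is still in progress.
 */
final class AuthGatedRegistry<C> {

  // Each future completes only once connect + authentication have both succeeded.
  private final ConcurrentHashMap<String, CompletableFuture<C>> ready =
      new ConcurrentHashMap<>();

  /**
   * Returns a future for an authenticated connection to the endpoint.
   * The first caller starts the handshake via connectAndAuthenticate
   * (assumed to return quickly with a pending future); concurrent callers
   * share that same pending future rather than seeing a half-ready connection.
   */
  CompletableFuture<C> acquire(
      String endpoint,
      Function<String, CompletableFuture<C>> connectAndAuthenticate) {
    CompletableFuture<C> future =
        ready.computeIfAbsent(endpoint, connectAndAuthenticate);
    // If the handshake fails, forget the entry so that a later caller retries.
    future.whenComplete((connection, error) -> {
      if (error != null) {
        ready.remove(endpoint, future);
      }
    });
    return future;
  }
}
```

With this shape, a second sender that calls `acquire` while the handshake is in flight receives the same pending future and would chain its send onto it (for example with `thenAccept`), so no request is written to the channel before authentication finishes.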