Subject: Re: same token couldn't authenticate twice?
From: "Xu (Simon) Chen"
To: user@accumulo.apache.org
Date: Fri, 12 Jun 2015 14:34:44 -0400

I did something like:

  AccumuloInputFormat.setZooKeeperInstance(Job,
      ClientConfiguration.loadDefault().withZkHosts(zk).withInstance(name).withSasl(true))

So this explicitly instructs the client to turn on SASL, and it now gets past
the point where I was stuck. Now I seem to have a different problem :-( I'll
look into it and report back later.

Thanks Josh!
-Simon
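
For reference, the same pattern as a complete, compilable sketch (the
instance name and ZooKeeper quorum are placeholders, not values from this
thread):

import org.apache.accumulo.core.client.ClientConfiguration;
import org.apache.accumulo.core.client.mapreduce.AccumuloInputFormat;
import org.apache.hadoop.mapreduce.Job;

public class SaslJobSetup {
  public static void configure(Job job) throws Exception {
    // Build the client configuration explicitly and hand it to the
    // InputFormat, so it is serialized into the Job rather than looked up
    // from a client.conf on whichever node the task happens to run.
    ClientConfiguration clientConf = ClientConfiguration.loadDefault()
        .withInstance("myInstance")        // placeholder instance name
        .withZkHosts("zk1:2181,zk2:2181")  // placeholder ZooKeeper quorum
        .withSasl(true);                   // the flag that was being lost
    AccumuloInputFormat.setZooKeeperInstance(job, clientConf);
  }
}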

On Fri, Jun 12, 2015 at 1:47 PM, Josh Elser wrote:
> Generally, it's not a good idea to assume that you can always locate the
> correct ClientConfiguration on the local filesystem.
>
> For example, YARN nodes might not have Accumulo installed, or might even
> point to the wrong Accumulo instance. The Job's Configuration serves as the
> de facto place for _all_ information that your Job needs to perform its
> work.
>
> Can you try calling AccumuloInputFormat.setZooKeeperInstance(Job,
> ClientConfiguration) instead?
>
> Great work tracking this down, Simon!
>
> Xu (Simon) Chen wrote:
>> Ah, I found the problem...
>>
>> In the Hadoop chain of events, this is eventually called, because
>> clientConfigString is not null (it contains instance.name and
>> instance.zookeeper.host):
>>
>> https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/mapreduce/lib/impl/ConfiguratorBase.java#L382
>>
>> Unfortunately, the deserialize function doesn't load the default options,
>> so it drops the SASL setting from ~/.accumulo/config:
>>
>> https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/ClientConfiguration.java#L235
>>
>> Would it be reasonable for deserialize to load the default settings?
>>
>> -Simon
>>
>> On Fri, Jun 12, 2015 at 11:34 AM, Josh Elser wrote:
>>> Just be careful with the mapreduce classes. I wouldn't be surprised if
>>> we try to avoid any locally installed client.conf in MapReduce (using
>>> only the ClientConfiguration stored inside the Job).
>>>
>>> Will wait to hear back from you :)
>>>
>>> Xu (Simon) Chen wrote:
>>>> Emm.. I have ~/.accumulo/config with "instance.rpc.sasl.enabled=true".
>>>> That property is indeed populated into the ClientConfiguration the
>>>> first time - that's why I said the token worked initially.
>>>>
>>>> Apparently, in the Hadoop portion that property is not set; I confirmed
>>>> this by adding debug messages to the ZooKeeperInstance class. I think
>>>> that's likely the issue.
>>>>
>>>> The ZooKeeperInstance is created in the following sequence:
>>>>
>>>> https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/mapred/AbstractInputFormat.java#L341
>>>>
>>>> https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/mapreduce/lib/impl/InputConfigurator.java#L671
>>>>
>>>> https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/mapreduce/lib/impl/ConfiguratorBase.java#L361
>>>>
>>>> The getClientConfiguration function eventually calls
>>>> getDefaultSearchPath(), so my ~/.accumulo/config should be searched.
>>>> I think we are close to the root cause... Will update when I find out
>>>> more.
>>>>
>>>> Thanks!
>>>> -Simon
>>>>
>>>> On Thu, Jun 11, 2015 at 11:28 PM, Josh Elser wrote:
>>>>> Are you sure that the Spark tasks have the proper ClientConfiguration?
>>>>> They need to have instance.rpc.sasl.enabled. I believe you should be
>>>>> able to set this via the AccumuloInputFormat.
>>>>>
>>>>> You can turn up logging (org.apache.accumulo.core.client=TRACE) and/or
>>>>> set the system property -Dsun.security.krb5.debug=true to get some
>>>>> more information as to why the authentication is failing.
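
To make the deserialize behavior Simon traced concrete, a minimal sketch (the
serialized string below mimics what the Job carries - instance name and
ZooKeepers only - with placeholder values):

import org.apache.accumulo.core.client.ClientConfiguration;

public class DeserializeRoundTrip {
  public static void main(String[] args) {
    // What effectively ends up serialized into the Job configuration.
    String serialized = "instance.name=myInstance\n"
        + "instance.zookeeper.host=zk1:2181";
    ClientConfiguration fromJob = ClientConfiguration.deserialize(serialized);
    // Prints false: deserialize() only restores the keys in the string and
    // does not consult ~/.accumulo/config, so the SASL flag is absent.
    System.out.println(fromJob.getBoolean("instance.rpc.sasl.enabled", false));
  }
}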

>>>>> Xu (Simon) Chen wrote:
>>>>>> Josh,
>>>>>>
>>>>>> I am using this function:
>>>>>>
>>>>>> https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/mapred/AbstractInputFormat.java#L106
>>>>>>
>>>>>> If I pass in a KerberosToken, it gets stuck at line 111; if I pass in
>>>>>> a delegation token, the setConnectorInfo function finishes fine.
>>>>>>
>>>>>> But when I do something like queryRDD.count, Spark eventually calls
>>>>>> HadoopRDD.getPartitions, which calls the following and gets stuck in
>>>>>> the last authenticate() function:
>>>>>>
>>>>>> https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/mapred/AbstractInputFormat.java#L621
>>>>>>
>>>>>> https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/mapred/AbstractInputFormat.java#L348
>>>>>>
>>>>>> https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/ZooKeeperInstance.java#L248
>>>>>>
>>>>>> https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/impl/ConnectorImpl.java#L70
>>>>>>
>>>>>> That is essentially the same place where it would get stuck with a
>>>>>> KerberosToken.
>>>>>>
>>>>>> -Simon
>>>>>>
>>>>>> On Thu, Jun 11, 2015 at 9:41 PM, Josh Elser wrote:
>>>>>>> What are the Accumulo methods that you are calling, and what is the
>>>>>>> error you are seeing?
>>>>>>>
>>>>>>> A KerberosToken cannot be used in a MapReduce job, which is why a
>>>>>>> DelegationToken is automatically retrieved. You should still be able
>>>>>>> to provide your own DelegationToken -- if that doesn't work, that's
>>>>>>> a bug.
>>>>>>>
>>>>>>> Xu (Simon) Chen wrote:
>>>>>>>> I actually added a flag such that I can pass in either a
>>>>>>>> KerberosToken or a DelegationTokenImpl to Accumulo.
>>>>>>>>
>>>>>>>> When a KerberosToken is passed in, Accumulo converts it to a
>>>>>>>> DelegationToken - the conversion is where I am having trouble. I
>>>>>>>> tried passing in a delegation token directly to bypass the
>>>>>>>> conversion, but a similar problem happens: I am stuck at
>>>>>>>> authenticate on the client side, and the server side prints the
>>>>>>>> same error...
>>>>>>>>
>>>>>>>> On Thursday, June 11, 2015, Josh Elser wrote:
>>>>>>>>> Keep in mind that the authentication paths for DelegationToken
>>>>>>>>> (mapreduce) and KerberosToken are completely different.
>>>>>>>>>
>>>>>>>>> Since most mapreduce jobs have multiple mappers (or reducers), I
>>>>>>>>> expect we would have run into the case that the same
>>>>>>>>> DelegationToken was used multiple times. It would still be good to
>>>>>>>>> narrow down the scope of the problem.
>>>>>>>>>
>>>>>>>>> Xu (Simon) Chen wrote:
>>>>>>>>>> Thanks Josh...
>>>>>>>>>>
>>>>>>>>>> I tested this in the Scala REPL and called
>>>>>>>>>> DataStoreFinder.getDataStore() multiple times; each time it seems
>>>>>>>>>> to reuse the same KerberosToken object, and it works fine each
>>>>>>>>>> time.
>>>>>>>>>>
>>>>>>>>>> So my problem only happens when the token is used in Accumulo's
>>>>>>>>>> mapred package. Weird..
>>>>>>>>>> -Simon
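
The "provide your own DelegationToken" route Josh mentions might look roughly
like this (a sketch against the 1.7 client API; the Job, principal, and
client configuration are assumed to come from elsewhere):

import org.apache.accumulo.core.client.ClientConfiguration;
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.ZooKeeperInstance;
import org.apache.accumulo.core.client.admin.DelegationTokenConfig;
import org.apache.accumulo.core.client.mapreduce.AccumuloInputFormat;
import org.apache.accumulo.core.client.security.tokens.DelegationToken;
import org.apache.accumulo.core.client.security.tokens.KerberosToken;
import org.apache.hadoop.mapreduce.Job;

public class ExplicitDelegationToken {
  public static void configure(Job job, ClientConfiguration clientConf,
      String principal) throws Exception {
    // Authenticate once with the current Kerberos login...
    Connector conn = new ZooKeeperInstance(clientConf)
        .getConnector(principal, new KerberosToken());
    // ...then fetch a DelegationToken for the MapReduce tasks to use,
    // bypassing the KerberosToken-to-DelegationToken conversion inside
    // setConnectorInfo().
    DelegationToken delToken = conn.securityOperations()
        .getDelegationToken(new DelegationTokenConfig());
    AccumuloInputFormat.setConnectorInfo(job, principal, delToken);
  }
}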

>>>>>>>>>>
>>>>>>>>>> On Thu, Jun 11, 2015 at 5:29 PM, Josh Elser wrote:
>>>>>>>>>>> Simon,
>>>>>>>>>>>
>>>>>>>>>>> Can you reproduce this in plain-jane Java code? I don't know
>>>>>>>>>>> enough about spark/scala, much less what Geomesa is actually
>>>>>>>>>>> doing, to know what the issue is.
>>>>>>>>>>>
>>>>>>>>>>> Also, which token are you referring to: a KerberosToken or a
>>>>>>>>>>> DelegationToken? Either of them should be usable as many times
>>>>>>>>>>> as you'd like (given the underlying credentials are still
>>>>>>>>>>> available for the KT, or the DT hasn't yet expired).
>>>>>>>>>>>
>>>>>>>>>>> Xu (Simon) Chen wrote:
>>>>>>>>>>>> Folks,
>>>>>>>>>>>>
>>>>>>>>>>>> I am working on geomesa+accumulo+spark integration. For some
>>>>>>>>>>>> reason, I found that the same token cannot be used to
>>>>>>>>>>>> authenticate twice.
>>>>>>>>>>>>
>>>>>>>>>>>> The workflow is that geomesa would try to create a hadoop rdd,
>>>>>>>>>>>> during which it tries to create an AccumuloDataStore:
>>>>>>>>>>>>
>>>>>>>>>>>> https://github.com/locationtech/geomesa/blob/master/geomesa-compute/src/main/scala/org/locationtech/geomesa/compute/spark/GeoMesaSpark.scala#L81
>>>>>>>>>>>>
>>>>>>>>>>>> During this process, a ZooKeeperInstance is created:
>>>>>>>>>>>>
>>>>>>>>>>>> https://github.com/locationtech/geomesa/blob/rc7_a1.7_h2.5/geomesa-core/src/main/scala/org/locationtech/geomesa/core/data/AccumuloDataStoreFactory.scala#L177
>>>>>>>>>>>>
>>>>>>>>>>>> I modified geomesa such that it would use Kerberos to
>>>>>>>>>>>> authenticate here. This step works fine.
>>>>>>>>>>>>
>>>>>>>>>>>> But next, geomesa calls ConfiguratorBase.setConnectorInfo:
>>>>>>>>>>>>
>>>>>>>>>>>> https://github.com/locationtech/geomesa/blob/rc7_a1.7_h2.5/geomesa-compute/src/main/scala/org/locationtech/geomesa/compute/spark/GeoMesaSpark.scala#L69
>>>>>>>>>>>>
>>>>>>>>>>>> This uses the same token and the same ZooKeeper URI, yet for
>>>>>>>>>>>> some reason it gets stuck in spark-shell, and the following is
>>>>>>>>>>>> output on the tserver side:
>>>>>>>>>>>>
>>>>>>>>>>>> 2015-06-06 18:58:19,616 [server.TThreadPoolServer] ERROR: Error occurred during processing of message.
>>>>>>>>>>>> java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
>>>>>>>>>>>>         at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
>>>>>>>>>>>>         at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:51)
>>>>>>>>>>>>         at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:48)
>>>>>>>>>>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>>>>>>         at javax.security.auth.Subject.doAs(Subject.java:356)
>>>>>>>>>>>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1622)
>>>>>>>>>>>>         at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory.getTransport(UGIAssumingTransportFactory.java:48)
>>>>>>>>>>>>         at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:208)
>>>>>>>>>>>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>>>>>>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>>>>>>>>         at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>>>>>>>>>>>>         at java.lang.Thread.run(Thread.java:745)
>>>>>>>>>>>> Caused by: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
>>>>>>>>>>>>         at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
>>>>>>>>>>>>         at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
>>>>>>>>>>>>         at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:182)
>>>>>>>>>>>>         at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
>>>>>>>>>>>>         at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
>>>>>>>>>>>>         at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
>>>>>>>>>>>>         at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
>>>>>>>>>>>>         ... 11 more
>>>>>>>>>>>> Caused by: java.net.SocketTimeoutException: Read timed out
>>>>>>>>>>>>         at java.net.SocketInputStream.socketRead0(Native Method)
>>>>>>>>>>>>         at java.net.SocketInputStream.read(SocketInputStream.java:152)
>>>>>>>>>>>>         at java.net.SocketInputStream.read(SocketInputStream.java:122)
>>>>>>>>>>>>         at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
>>>>>>>>>>>>         at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
>>>>>>>>>>>>         at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
>>>>>>>>>>>>         ... 17 more
>>>>>>>>>>>>
>>>>>>>>>>>> Any idea why?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks.
>>>>>>>>>>>> -Simon
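
For anyone debugging a similar SASL handshake timeout, the two knobs Josh
suggested upthread can also be flipped programmatically - a sketch (note that
the krb5 property must be set before any Kerberos classes load, so the
-Dsun.security.krb5.debug=true JVM flag is usually the safer route):

import org.apache.log4j.Level;
import org.apache.log4j.Logger;

public class SaslDebugKnobs {
  public static void enable() {
    // Verbose JDK Kerberos tracing (equivalent to the JVM flag above).
    System.setProperty("sun.security.krb5.debug", "true");
    // Trace-level logging for the Accumulo client, per Josh's suggestion.
    Logger.getLogger("org.apache.accumulo.core.client").setLevel(Level.TRACE);
  }
}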