accumulo-user mailing list archives

From Josh Elser <josh.el...@gmail.com>
Subject Re: same token couldn't authenticate twice?
Date Fri, 12 Jun 2015 03:28:06 GMT
Are you sure that the Spark tasks have the proper ClientConfiguration? 
They need to have instance.rpc.sasl.enabled set. I believe you should 
be able to set this via the AccumuloInputFormat.
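
For example, something like this (a minimal sketch, untested; the 
instance name and ZooKeeper quorum are placeholders):

    import org.apache.accumulo.core.client.ClientConfiguration;
    import org.apache.accumulo.core.client.mapred.AccumuloInputFormat;
    import org.apache.hadoop.mapred.JobConf;

    public class SaslJobSetup {
      public static void configure(JobConf job) {
        // Build a client config with SASL (Kerberos) RPCs enabled, i.e.
        // instance.rpc.sasl.enabled=true, and hand it to the input
        // format so the remote tasks pick it up from the job config.
        ClientConfiguration clientConf = ClientConfiguration.loadDefault()
            .withInstance("myInstance")   // placeholder
            .withZkHosts("zk1:2181")      // placeholder
            .withSasl(true);
        AccumuloInputFormat.setZooKeeperInstance(job, clientConf);
      }
    }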

You can turn up logging with org.apache.accumulo.core.client=TRACE and/or 
set the system property -Dsun.security.krb5.debug=true to get more 
information as to why the authentication is failing.
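
Concretely, something like this (assuming log4j on the client and a 
spark-shell launcher; adjust for your setup):

    # log4j.properties
    log4j.logger.org.apache.accumulo.core.client=TRACE

    # Kerberos debug output on the driver and executors
    spark-shell \
      --driver-java-options "-Dsun.security.krb5.debug=true" \
      --conf spark.executor.extraJavaOptions=-Dsun.security.krb5.debug=true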

Xu (Simon) Chen wrote:
> Josh,
>
> I am using this function:
>
> https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/mapred/AbstractInputFormat.java#L106
>
> If I pass in a KerberosToken, it's stuck at line 111; if I pass in a
> delegation token, the setConnectorInfo function finishes fine.
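>
> A minimal sketch of the call (untested; the principal is a
> placeholder, and the token is either a KerberosToken or the
> DelegationTokenImpl selected by my flag):
>
>     import org.apache.accumulo.core.client.AccumuloSecurityException;
>     import org.apache.accumulo.core.client.mapred.AccumuloInputFormat;
>     import org.apache.accumulo.core.client.security.tokens.AuthenticationToken;
>     import org.apache.hadoop.mapred.JobConf;
>
>     public class ConnectorInfoSetup {
>       public static void set(JobConf job, AuthenticationToken token)
>           throws AccumuloSecurityException {
>         // With a KerberosToken this hangs (Accumulo tries to convert
>         // it to a DelegationToken); with a DelegationTokenImpl it
>         // returns fine.
>         AccumuloInputFormat.setConnectorInfo(job, "user@EXAMPLE.COM", token);
>       }
>     }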
>
> But when I do something like queryRDD.count, Spark eventually calls
> HadoopRDD.getPartitions, which goes through the following calls and
> gets stuck in the final authenticate() call:
> https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/mapred/AbstractInputFormat.java#L621
> https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/mapred/AbstractInputFormat.java#L348
> https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/ZooKeeperInstance.java#L248
> https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/impl/ConnectorImpl.java#L70
>
> That is essentially the same place where it gets stuck with a KerberosToken.
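>
> In plain Java, the hanging step boils down to something like this
> (sketch; clientConf is the SASL-enabled ClientConfiguration, and the
> principal/token are the ones from above):
>
>     import org.apache.accumulo.core.client.AccumuloException;
>     import org.apache.accumulo.core.client.AccumuloSecurityException;
>     import org.apache.accumulo.core.client.ClientConfiguration;
>     import org.apache.accumulo.core.client.Connector;
>     import org.apache.accumulo.core.client.ZooKeeperInstance;
>     import org.apache.accumulo.core.client.security.tokens.AuthenticationToken;
>
>     public class Repro {
>       public static Connector connect(ClientConfiguration clientConf,
>           String principal, AuthenticationToken token)
>           throws AccumuloException, AccumuloSecurityException {
>         // ConnectorImpl's constructor calls authenticate(); this is
>         // where the client blocks while the tserver logs a read timeout.
>         return new ZooKeeperInstance(clientConf).getConnector(principal, token);
>       }
>     }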
>
> -Simon
>
> On Thu, Jun 11, 2015 at 9:41 PM, Josh Elser <josh.elser@gmail.com> wrote:
>> What are the Accumulo methods that you are calling and what is the error you
>> are seeing?
>>
>> A KerberosToken cannot be used in a MapReduce job, which is why a
>> DelegationToken is automatically retrieved. You should still be able to
>> provide your own DelegationToken -- if that doesn't work, that's a bug.
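>>
>> For reference, retrieving your own DelegationToken looks roughly like
>> this (sketch; the Connector comes from an earlier KerberosToken login):
>>
>>     import org.apache.accumulo.core.client.AccumuloException;
>>     import org.apache.accumulo.core.client.AccumuloSecurityException;
>>     import org.apache.accumulo.core.client.Connector;
>>     import org.apache.accumulo.core.client.admin.DelegationTokenConfig;
>>     import org.apache.accumulo.core.client.security.tokens.DelegationToken;
>>
>>     public class TokenFetch {
>>       public static DelegationToken fetch(Connector conn)
>>           throws AccumuloException, AccumuloSecurityException {
>>         // Ask the master for a delegation token that remote tasks
>>         // without Kerberos credentials can authenticate with.
>>         return conn.securityOperations()
>>             .getDelegationToken(new DelegationTokenConfig());
>>       }
>>     }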
>>
>> Xu (Simon) Chen wrote:
>>> I actually added a flag such that I can pass in either a KerberosToken
>>> or a DelegationTokenImpl to Accumulo.
>>>
>>> When a KerberosToken is passed in, Accumulo converts it to a
>>> DelegationToken - the conversion is where I am having trouble. I tried
>>> passing in a delegation token directly to bypass the conversion, but a
>>> similar problem happens: I am stuck at authenticate() on the client
>>> side, and the server side produces the same output...
>>>
>>> On Thursday, June 11, 2015, Josh Elser <josh.elser@gmail.com> wrote:
>>>
>>>      Keep in mind that the authentication paths for DelegationToken
>>>      (mapreduce) and KerberosToken are completely different.
>>>
>>>      Since most mapreduce jobs have multiple mappers (or reducers), I
>>>      expect we would have run into the case that the same DelegationToken
>>>      was used multiple times. It would still be good to narrow down the
>>>      scope of the problem.
>>>
>>>      Xu (Simon) Chen wrote:
>>>
>>>          Thanks Josh...
>>>
>>>          I tested this in the Scala REPL and called
>>>          DataStoreFinder.getDataStore() multiple times; each time it
>>>          reused the same KerberosToken object and worked fine.
>>>
>>>          So my problem only happens when the token is used in Accumulo's
>>>          mapred package. Weird...
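>>>
>>>          The test was essentially the following (Java equivalent of
>>>          the REPL session; the parameter keys come from my modified
>>>          GeoMesa and are illustrative only):
>>>
>>>              import java.io.IOException;
>>>              import java.io.Serializable;
>>>              import java.util.Map;
>>>              import org.geotools.data.DataStore;
>>>              import org.geotools.data.DataStoreFinder;
>>>
>>>              public class RepeatTest {
>>>                public static void run(Map<String, Serializable> params)
>>>                    throws IOException {
>>>                  // Two lookups with the same parameter map (and thus
>>>                  // the same KerberosToken underneath) both succeed
>>>                  // outside of the mapred package.
>>>                  DataStore first = DataStoreFinder.getDataStore(params);
>>>                  DataStore second = DataStoreFinder.getDataStore(params);
>>>                }
>>>              }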
>>>
>>>          -Simon
>>>
>>>
>>>          On Thu, Jun 11, 2015 at 5:29 PM, Josh Elser <josh.elser@gmail.com> wrote:
>>>
>>>              Simon,
>>>
>>>              Can you reproduce this in plain-jane Java code? I don't
>>>              know enough about Spark/Scala, much less what GeoMesa is
>>>              actually doing, to know what the issue is.
>>>
>>>              Also, which token are you referring to: a KerberosToken
>>>              or a DelegationToken? Either should be usable as many
>>>              times as you'd like (given that the underlying credentials
>>>              are still available for a KerberosToken, or that the
>>>              DelegationToken hasn't yet expired).
>>>
>>>
>>>              Xu (Simon) Chen wrote:
>>>
>>>                  Folks,
>>>
>>>                  I am working on GeoMesa+Accumulo+Spark integration.
>>>                  For some reason, I found that the same token cannot
>>>                  be used to authenticate twice.
>>>
>>>                  The workflow is that GeoMesa tries to create a Hadoop
>>>                  RDD, during which it creates an AccumuloDataStore:
>>>
>>>
>>> https://github.com/locationtech/geomesa/blob/master/geomesa-compute/src/main/scala/org/locationtech/geomesa/compute/spark/GeoMesaSpark.scala#L81
>>>
>>>                  During this process, a ZooKeeperInstance is created:
>>>
>>>
>>> https://github.com/locationtech/geomesa/blob/rc7_a1.7_h2.5/geomesa-core/src/main/scala/org/locationtech/geomesa/core/data/AccumuloDataStoreFactory.scala#L177
>>>                  I modified GeoMesa so that it uses Kerberos to
>>>                  authenticate here. This step works fine.
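>>>
>>>                  My modification amounts to roughly this (sketch;
>>>                  clientConf has SASL enabled):
>>>
>>>                      import java.io.IOException;
>>>                      import org.apache.accumulo.core.client.AccumuloException;
>>>                      import org.apache.accumulo.core.client.AccumuloSecurityException;
>>>                      import org.apache.accumulo.core.client.ClientConfiguration;
>>>                      import org.apache.accumulo.core.client.Connector;
>>>                      import org.apache.accumulo.core.client.ZooKeeperInstance;
>>>                      import org.apache.accumulo.core.client.security.tokens.KerberosToken;
>>>
>>>                      public class KerberosLogin {
>>>                        public static Connector login(ClientConfiguration clientConf)
>>>                            throws AccumuloException, AccumuloSecurityException, IOException {
>>>                          // The no-arg KerberosToken uses the
>>>                          // currently logged-in (kinit'd) user.
>>>                          KerberosToken token = new KerberosToken();
>>>                          return new ZooKeeperInstance(clientConf)
>>>                              .getConnector(token.getPrincipal(), token);
>>>                        }
>>>                      }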
>>>
>>>                  But next, GeoMesa calls ConfiguratorBase.setConnectorInfo:
>>>
>>>
>>> https://github.com/locationtech/geomesa/blob/rc7_a1.7_h2.5/geomesa-compute/src/main/scala/org/locationtech/geomesa/compute/spark/GeoMesaSpark.scala#L69
>>>
>>>                  This uses the same token and the same ZooKeeper URI;
>>>                  for some reason it gets stuck in spark-shell, and the
>>>                  following is output on the tserver side:
>>>
>>>                  2015-06-06 18:58:19,616 [server.TThreadPoolServer] ERROR: Error occurred during processing of message.
>>>                  java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
>>>                          at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
>>>                          at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:51)
>>>                          at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:48)
>>>                          at java.security.AccessController.doPrivileged(Native Method)
>>>                          at javax.security.auth.Subject.doAs(Subject.java:356)
>>>                          at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1622)
>>>                          at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory.getTransport(UGIAssumingTransportFactory.java:48)
>>>                          at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:208)
>>>                          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>                          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>                          at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>>>                          at java.lang.Thread.run(Thread.java:745)
>>>                  Caused by: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
>>>                          at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
>>>                          at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
>>>                          at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:182)
>>>                          at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
>>>                          at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
>>>                          at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
>>>                          at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
>>>                          ... 11 more
>>>                  Caused by: java.net.SocketTimeoutException: Read timed out
>>>                          at java.net.SocketInputStream.socketRead0(Native Method)
>>>                          at java.net.SocketInputStream.read(SocketInputStream.java:152)
>>>                          at java.net.SocketInputStream.read(SocketInputStream.java:122)
>>>                          at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
>>>                          at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
>>>                          at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
>>>                          ... 17 more
>>>
>>>                  Any idea why?
>>>
>>>                  Thanks.
>>>                  -Simon
>>>
