crunch-user mailing list archives

From Tahir Hameed <tah...@gmail.com>
Subject Re: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
Date Thu, 01 Oct 2015 10:32:31 GMT
Thanks for the feedback,

I tried the above method, but I'm using version 0.98.4.2.2.4.4-16-hadoop2
of HBase on the cluster and version 0.12.0-hadoop2 of Apache Crunch.
I tried using 0.11.0-hadoop2 with the CRUNCH-536 patch applied, but I ran
into other errors. I haven't been able to find a git release for
0.12.0-hadoop2 to add the CRUNCH-536 changes to.

Also, I am already calling TableMapReduceUtil.initCredentials(mrJob.getJob())
in my own code for all the tables I read. I read the table and convert it
into a readable instance that is then accessed from another DoFn, roughly as
in the sketch below. Of the 2 pipelines, one works absolutely fine (no
errors) and the other fails with Kerberos authentication errors. The only
difference I see is the use of a PTable in one and a PGroupedTable in the
other.
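
Concretely, the pattern is roughly this (a simplified sketch; the helper and
variable names here are placeholders, not my actual code):

    // the side table, read earlier in the pipeline
    PTable<String, C> others = readOthersTable(pipeline);

    // converted into a ReadableData so the DoFn can load it in initialize()
    ReadableData<Pair<String, C>> readable = others.asReadable(false);

    // credentials initialized for the MR job that reads the HBase tables
    // (mrJob stands for whichever job is being prepared)
    TableMapReduceUtil.initCredentials(mrJob.getJob());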

Would sharing the code, for instance, be helpful in identifying the
problem?

Best,

Tahir

On Thu, Oct 1, 2015 at 8:58 AM, Gabriel Reid <gabriel.reid@gmail.com> wrote:
>
> If I'm reading that stack trace correctly, CEDoFn is reading from an
> HBase table in its initialize method (probably via a ReadableData
> instance).
>
> It looks like the HBase instance is kerberized, which will mean that
> TableMapReduceUtil.initCredentials(Job) needs to be called before
> submitting the job.
>
> There was a relatively recent patch added in Crunch (see CRUNCH-536)
> to make it easier to add the call to
> TableMapReduceUtil.initCredentials. If you build a version of Crunch
> with CRUNCH-536 included, you should be able to add the following call
> during the setup of your pipeline:
>
>     pipeline.addPrepareHook(new CrunchControlledJob.Hook() {
>         @Override
>         public void run(MRJob mrJob) throws IOException {
>             TableMapReduceUtil.initCredentials(mrJob.getJob());
>         }
>     });
>
>
> - Gabriel
>
> On Wed, Sep 30, 2015 at 11:17 PM, Tahir Hameed <tahirh@gmail.com> wrote:
> > It is HDFS. The setup for both pipelines is the same too.
> >
> >
> >
> > On Wed, Sep 30, 2015 at 10:17 PM, Micah Whitacre <mkwhitacre@gmail.com>
> > wrote:
> >>
> >> What is the datastore you are reading from? HBase? HDFS? Also, are
> >> there any setup differences between the two pipelines?
> >>
> >> On Wed, Sep 30, 2015 at 3:13 PM, Tahir Hameed <tahirh@gmail.com> wrote:
> >>>
> >>> Hi,
> >>>
> >>> I am facing a queer problem. I have 2 MR pipelines. One of them is
> >>> working fine. The other is not.
> >>>
> >>> The difference lies in only one of the DoFn functions.
> >>>
> >>>
> >>> The DoFn function which fails is given below:
> >>>
> >>>     public PTable<ImmutableBytesWritable, CE> myFunction(
> >>>             PTable<ImmutableBytesWritable, Pair<A, B>> joinedData,
> >>>             PTable<String, C> others) {
> >>>
> >>>         ReadableData<Pair<String, C>> readable = others.asReadable(false);
> >>>         ParallelDoOptions options = ParallelDoOptions.builder()
> >>>                 .sourceTargets(readable.getSourceTargets())
> >>>                 .build();
> >>>
> >>>         return joinedData
> >>>                 .by(someMapFunction, Avros.writables(ImmutableBytesWritable.class))
> >>>                 .groupByKey()
> >>>                 .parallelDo("", new CEDoFN(readable, others.getPTableType()),
> >>>                         Avros.tableOf(Avros.writables(ImmutableBytesWritable.class),
> >>>                                 Avros.reflects(CE.class)),
> >>>                         options);
> >>>     }
> >>>
> >>> The stack trace is as follows:
> >>>
> >>> javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
> >>>     at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
> >>>     at org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:177)
> >>>     at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupSaslConnection(RpcClient.java:815)
> >>>     at org.apache.hadoop.hbase.ipc.RpcClient$Connection.access$800(RpcClient.java:349)
> >>>     at org.apache.hadoop.hbase.ipc.RpcClient$Connection$2.run(RpcClient.java:943)
> >>>     at org.apache.hadoop.hbase.ipc.RpcClient$Connection$2.run(RpcClient.java:940)
> >>>     at java.security.AccessController.doPrivileged(Native Method)
> >>>     at javax.security.auth.Subject.doAs(Subject.java:415)
> >>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> >>>     at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:940)
> >>>     at org.apache.hadoop.hbase.ipc.RpcClient$Connection.writeRequest(RpcClient.java:1094)
> >>>     at org.apache.hadoop.hbase.ipc.RpcClient$Connection.tracedWriteRequest(RpcClient.java:1061)
> >>>     at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1516)
> >>>     at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1724)
> >>>     at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1777)
> >>>     at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.get(ClientProtos.java:30373)
> >>>     at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRowOrBefore(ProtobufUtil.java:1604)
> >>>     at org.apache.hadoop.hbase.client.HTable$2.call(HTable.java:768)
> >>>     at org.apache.hadoop.hbase.client.HTable$2.call(HTable.java:766)
> >>>     at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:126)
> >>>     at org.apache.hadoop.hbase.client.HTable.getRowOrBefore(HTable.java:772)
> >>>     at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:160)
> >>>     at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.prefetchRegionCache(ConnectionManager.java:1254)
> >>>     at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1318)
> >>>     at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1167)
> >>>     at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:294)
> >>>     at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:130)
> >>>     at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:55)
> >>>     at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:201)
> >>>     at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:288)
> >>>     at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:268)
> >>>     at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:140)
> >>>     at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:135)
> >>>     at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:802)
> >>>     at org.apache.crunch.io.hbase.HTableIterator.<init>(HTableIterator.java:47)
> >>>     at org.apache.crunch.io.hbase.HTableIterable.iterator(HTableIterable.java:43)
> >>>     at org.apache.crunch.util.DelegatingReadableData$1.iterator(DelegatingReadableData.java:63)
> >>>     at com.bol.step.enrichmentdashboard.fn.CEDoFN.initialize(CEDoFN.java:45)
> >>>     at org.apache.crunch.impl.mr.run.RTNode.initialize(RTNode.java:71)
> >>>     at org.apache.crunch.impl.mr.run.RTNode.initialize(RTNode.java:73)
> >>>     at org.apache.crunch.impl.mr.run.CrunchReducer.setup(CrunchReducer.java:44)
> >>>     at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:168)
> >>>     at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
> >>>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
> >>>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
> >>>     at java.security.AccessController.doPrivileged(Native Method)
> >>>     at javax.security.auth.Subject.doAs(Subject.java:415)
> >>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> >>>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> >>> Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
> >>>
> >>>
> >>> In CEDoFN, the readable is used in the initialization phase to build a
> >>> HashMap, roughly as sketched below. This is exactly the place the stack
> >>> trace points to.
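> >>>
> >>> Something like this (a simplified sketch; the field and type names are
> >>> placeholders, not the real ones):
> >>>
> >>>     @Override
> >>>     public void initialize() {
> >>>         lookup = new HashMap<String, C>();
> >>>         try {
> >>>             // iterating the ReadableData opens the HBase scanner,
> >>>             // which is where the GSSException above is thrown
> >>>             for (Pair<String, C> pair : readable.read(getContext())) {
> >>>                 lookup.put(pair.first(), pair.second());
> >>>             }
> >>>         } catch (IOException e) {
> >>>             throw new CrunchRuntimeException(e);
> >>>         }
> >>>     }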
> >>>
> >>> In the function that succeeds, the parallelDo is performed directly on
> >>> the joinedData (which is also a PTable), and there are no errors; a
> >>> rough sketch follows below. The initialization phases of both functions
> >>> are exactly the same.
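> >>>
> >>> Roughly (again with placeholder names), the working variant does:
> >>>
> >>>     return joinedData.parallelDo("", new WorkingDoFN(readable, others.getPTableType()),
> >>>             Avros.tableOf(Avros.writables(ImmutableBytesWritable.class),
> >>>                     Avros.reflects(CE.class)), options);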
> >>>
> >>> I fail to understand the cause of the errors, because the underlying
> >>> implementations of both PTable and PGroupedTable are the same: both
> >>> seem to extend PCollectionImpl.
> >>>
> >>> Tahir
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>
> >
