hbase-user mailing list archives

From Nkechi Achara <nkach...@googlemail.com>
Subject Re: hbase/spark - Delegation Token can be issued only with kerberos or web authentication
Date Wed, 23 Nov 2016 11:07:21 GMT
Hi Abel,

Apologies, but I have been quite busy. The version I am using is here:

https://github.com/barkhorn/SparkOnHBase

The tgt does look valid overall, so I will go back to my earlier thought
that there is an issue with the classpath on the submit.
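If it would help to rule the classpath in or out, one quick check is to try
loading one of the HBase classes on every executor. A rough sketch (it assumes
an existing SparkContext called sc and is not part of the hbase-spark code):

import org.apache.spark.SparkContext

def checkExecutorClasspath(sc: SparkContext): Unit = {
  // Attempt to resolve an HBase client class inside each task and report
  // the distinct outcomes back to the driver.
  val outcomes = sc.parallelize(1 to sc.defaultParallelism, sc.defaultParallelism)
    .mapPartitions { _ =>
      val result =
        try {
          Class.forName("org.apache.hadoop.hbase.client.ConnectionFactory")
          "HBase client classes visible"
        } catch {
          case e: ClassNotFoundException => "Missing on executor: " + e.getMessage
        }
      Iterator(result)
    }
    .distinct()
    .collect()
  outcomes.foreach(println)
}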

Thanks

On 22 November 2016 at 10:14, Abel Fernández <mevsmyself@gmail.com> wrote:

> I think the tgt is not the problem; checking the logs, I can see:
>
> 16/11/22 10:06:40 DEBUG [main] YarnSparkHadoopUtil: running as user: hbase
> 16/11/22 10:06:40 DEBUG [main] UserGroupInformation: hadoop login
> 16/11/22 10:06:40 DEBUG [main] UserGroupInformation: hadoop login commit
> 16/11/22 10:06:40 DEBUG [main] UserGroupInformation: using kerberos
> user:hbase@COMPANY.CORP
> 16/11/22 10:06:40 DEBUG [main] UserGroupInformation: Using user:
> "hbase@COMPANY.CORP" with name hbase@COMPANY.CORP
> 16/11/22 10:06:40 DEBUG [main] UserGroupInformation: User entry:
> "hbase@COMPANY.CORP"
> 16/11/22 10:06:40 DEBUG [main] UserGroupInformation: UGI
> loginUser:hbase@COMPANY.CORP (auth:KERBEROS)
> 16/11/22 10:06:40 DEBUG [main] UserGroupInformation: PrivilegedAction
> as:hbase (auth:SIMPLE)
> from:org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(
> SparkHadoopUtil.scala:68)
> 16/11/22 10:06:40 DEBUG [TGT Renewer for hbase@COMPANY.CORP]
> UserGroupInformation: Found tgt Ticket (hex) =
> 0000: 61 82 01 61 30 82 01 5D   A0 03 02 01 05 A1 12 1B  a..a0..]........
> 0010: 10 53 41 4E 54 41 4E 44   45 52 55 4B 2E 43 4F 52  .COMPANY.COR
> 0020: 50 A2 25 30 23 A0 03 02   01 02 A1 1C 30 1A 1B 06  P.%0#.......0...
> 0030: 6B 72 62 74 67 74 1B 10   53 41 4E 54 41 4E 44 45
> ....
>
> Client Principal = hbase@COMPANY.CORP
> Server Principal = krbtgt/COMPANY.CORP@COMPANY.CORP
> Session Key = EncryptionKey: keyType=18 keyBytes (hex dump)=
> 0000: 2D 9D 67 F5 7C B4 15 17   AE DE BE A5 B9 2C 15 95  -.g..........,..
> 0010: E6 6B 1C 4A 02 A2 44 67   6D D2 16 36 4A DA 11 82  .k.J..Dgm..6J...
>
>
> Forwardable Ticket true
> Forwarded Ticket false
> Proxiable Ticket false
> Proxy Ticket false
> Postdated Ticket false
> Renewable Ticket true
> Initial Ticket true
> Auth Time = Tue Nov 22 03:39:05 CET 2016
> Start Time = Tue Nov 22 03:39:05 CET 2016
> End Time = Wed Nov 23 03:39:05 CET 2016
> Renew Till = Tue Nov 29 03:39:05 CET 2016
> Client Addresses  Null
> 16/11/22 10:06:40 DEBUG [TGT Renewer for hbase@COMPANY.CORP]
> UserGroupInformation: Current time is 1479805600691
> 16/11/22 10:06:40 DEBUG [TGT Renewer for hbase@COMPANY.CORP]
> UserGroupInformation: Next refresh is 1479851465000
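> For reference, those two epoch-millisecond values decode to ordinary dates
> (a trivial Scala sketch, nothing Hadoop-specific):
>
> // Convert the millisecond timestamps printed by UserGroupInformation above.
> println(new java.util.Date(1479805600691L)) // the "Current time" value
> println(new java.util.Date(1479851465000L)) // the "Next refresh" value
>
> which puts the next TGT refresh several hours after the log time.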
>
> Is the retrofit version you are using public? We are using CDH 5.5.4, but
> with a backported version of hbase-spark built from the latest code released
> on GitHub.
>
> On Mon, 21 Nov 2016 at 21:11 Nkechi Achara <nkachara@googlemail.com>
> wrote:
>
> > I am still convinced that it could be due to classpath issues, but I might
> > be missing something.
> >
> > Just to make sure: have you checked using the principal / keytab on the
> > driver only, so you can make sure the tgt is valid?
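> > A quick way to check that on the driver is a tiny standalone snippet using
> > the same principal and keytab as the spark-submit (a sketch, run outside
> > Spark; the paths are the ones from your submit command):
> >
> > import org.apache.hadoop.conf.Configuration
> > import org.apache.hadoop.security.UserGroupInformation
> >
> > // Enable Kerberos in the Hadoop config, log in from the keytab, and print
> > // what UGI thinks the current user and auth method are.
> > val conf = new Configuration()
> > conf.set("hadoop.security.authentication", "kerberos")
> > UserGroupInformation.setConfiguration(conf)
> > UserGroupInformation.loginUserFromKeytab(
> >   "hbase@COMPANY.CORP", "/opt/company/conf/hbase.keytab")
> > val ugi = UserGroupInformation.getCurrentUser
> > println("user=" + ugi.getUserName)
> > println("auth=" + ugi.getAuthenticationMethod)
> > println("hasKerberosCredentials=" + ugi.hasKerberosCredentials)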
> >
> > I am using the same config but with CDH 5.5.2, and I am using a retrofit
> > of the Cloudera Labs HBase-on-Spark package.
> >
> > Thanks
> >
> > On 21 Nov 2016 5:32 p.m., "Abel Fernández" <mevsmyself@gmail.com> wrote:
> >
> > > I have included the krb5.conf and the jaas.conf in the spark-submit and
> > > on all node managers and drivers, but I am still having the same problem.
> > >
> > > I think the problem is in this piece of code: it is trying to execute a
> > > function on the executors, and for some reason the executors cannot get
> > > valid credentials.
> > >
> > > /**
> > >  * A simple enrichment of the traditional Spark RDD foreachPartition.
> > >  * This function differs from the original in that it offers the
> > >  * developer access to an already connected Connection object
> > >  *
> > >  * Note: Do not close the Connection object.  All Connection
> > >  * management is handled outside this method
> > >  *
> > >  * @param rdd  Original RDD with data to iterate over
> > >  * @param f    Function to be given an iterator to iterate through
> > >  *             the RDD values and a Connection object to interact
> > >  *             with HBase
> > >  */
> > > def foreachPartition[T](rdd: RDD[T],
> > >                         f: (Iterator[T], Connection) => Unit):Unit = {
> > >   rdd.foreachPartition(
> > >     it => hbaseForeachPartition(broadcastedConf, it, f))
> > > }
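> > > For reference, a minimal invocation of that method looks roughly like this
> > > (a sketch only; the table name, column family and row handling below are
> > > placeholders, not our actual job):
> > >
> > > import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
> > > import org.apache.hadoop.hbase.client.{Connection, Put}
> > > import org.apache.hadoop.hbase.spark.HBaseContext
> > > import org.apache.hadoop.hbase.util.Bytes
> > > import org.apache.spark.SparkContext
> > >
> > > def demoWrite(sc: SparkContext): Unit = {
> > >   // hbase-site.xml must be on the classpath for this configuration to work
> > >   val hbaseContext = new HBaseContext(sc, HBaseConfiguration.create())
> > >   val rdd = sc.parallelize(Seq("row1", "row2"))
> > >
> > >   // The function below runs on the executors with an already open Connection.
> > >   hbaseContext.foreachPartition(rdd, (it: Iterator[String], conn: Connection) => {
> > >     val table = conn.getTable(TableName.valueOf("demo_table"))
> > >     it.foreach { row =>
> > >       table.put(new Put(Bytes.toBytes(row))
> > >         .addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes(row)))
> > >     }
> > >     table.close()
> > >   })
> > > }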
> > >
> > >
> > > The first thing hbaseForeachPartition tries to do is get the credentials,
> > > but I think this code is never executed:
> > >
> > > /**
> > >  *  Underlying wrapper for all foreach functions in HBaseContext
> > >  */
> > > private def hbaseForeachPartition[T](
> > >     configBroadcast: Broadcast[SerializableWritable[Configuration]],
> > >     it: Iterator[T],
> > >     f: (Iterator[T], Connection) => Unit) = {
> > >
> > >   val config = getConf(configBroadcast)
> > >
> > >   applyCreds
> > >   // specify that this is a proxy user
> > >   val smartConn = HBaseConnectionCache.getConnection(config)
> > >   f(it, smartConn.connection)
> > >   smartConn.close()
> > > }
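> > > For context, my understanding of what the applyCreds step is meant to do is
> > > roughly the following (a paraphrased sketch, not the actual hbase-spark
> > > source; credentialsConf here is assumed to be a broadcast of the driver's
> > > Credentials):
> > >
> > > import org.apache.hadoop.security.{Credentials, UserGroupInformation}
> > > import org.apache.spark.SerializableWritable
> > > import org.apache.spark.broadcast.Broadcast
> > >
> > > // Sketch: merge the delegation tokens broadcast from the driver into the
> > > // executor's current UGI, so the HBase/HDFS calls made inside f() can
> > > // authenticate without a Kerberos TGT on the executor.
> > > def applyCredsSketch(
> > >     credentialsConf: Broadcast[SerializableWritable[Credentials]]): Unit = {
> > >   val creds = credentialsConf.value.value
> > >   if (creds != null) {
> > >     UserGroupInformation.getCurrentUser.addCredentials(creds)
> > >   }
> > > }
> > >
> > > If that never runs on the executors, they are left without the driver's
> > > tokens, which would match the behaviour we are seeing.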
> > >
> > >
> > > This is the latest spark-submit I am using:
> > > #!/bin/bash
> > >
> > > SPARK_CONF_DIR=conf-hbase spark-submit --master yarn-cluster \
> > >   --executor-memory 6G \
> > >   --num-executors 10 \
> > >   --queue cards \
> > >   --executor-cores 4 \
> > >   --driver-java-options "-Dlog4j.configuration=file:log4j.properties" \
> > >   --driver-java-options "-Djava.security.krb5.conf=/etc/krb5.conf" \
> > >   --driver-java-options "-Djava.security.auth.login.config=/opt/company/conf/jaas.conf" \
> > >   --driver-class-path "$2" \
> > >   --jars file:/opt/company/lib/rocksdbjni-4.5.1.jar \
> > >   --conf "spark.driver.extraClassPath=/var/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.2.0-incubating.jar:/var/cloudera/parcels/CDH/jars/hbase-server-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/jars/hbase-common-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/lib/hbase/lib/hbase-client-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/lib/hbase/lib/hbase-protocol-1.0.0-cdh5.5.4.jar:/opt/orange/lib/rocksdbjni-4.5.1.jar:/var/cloudera/parcels/CLABS_PHOENIX-4.5.2-1.clabs_phoenix1.2.0.p0.774/lib/phoenix/lib/phoenix-core-1.2.0.jar:/var/cloudera/parcels/CDH/jars/hadoop-mapreduce-client-core-2.6.0-cdh5.5.4.jar" \
> > >   --conf "spark.executor.extraClassPath=/var/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.2.0-incubating.jar:/var/cloudera/parcels/CDH/jars/hbase-server-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/jars/hbase-common-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/lib/hbase/lib/hbase-client-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/lib/hbase/lib/hbase-protocol-1.0.0-cdh5.5.4.jar:/opt/orange/lib/rocksdbjni-4.5.1.jar:/var/cloudera/parcels/CLABS_PHOENIX-4.5.2-1.clabs_phoenix1.2.0.p0.774/lib/phoenix/lib/phoenix-core-1.2.0.jar:/var/cloudera/parcels/CDH/jars/hadoop-mapreduce-client-core-2.6.0-cdh5.5.4.jar" \
> > >   --principal hbase@COMPANY.CORP \
> > >   --keytab /opt/company/conf/hbase.keytab \
> > >   --files "owl.properties,conf-hbase/log4j.properties,conf-hbase/hbase-site.xml,conf-hbase/core-site.xml,$2" \
> > >   --class $1 \
> > >   cards-batch-$3-jar-with-dependencies.jar $2
> > >
> > >
> > >
> > > On Fri, 18 Nov 2016 at 16:37 Abel Fernández <mevsmyself@gmail.com>
> > wrote:
> > >
> > > > No worries.
> > > >
> > > > This is the spark version we are using:  1.5.0-cdh5.5.4
> > > >
> > > > I have to use HBaseContext; it is the first parameter for the method I am
> > > > using to generate the HFiles (HBaseRDDFunctions.hbaseBulkLoadThinRows).
> > > >
> > > > On Fri, 18 Nov 2016 at 16:06 Nkechi Achara <nkachara@googlemail.com>
> > > > wrote:
> > > >
> > > > Sorry, on my way to a flight.
> > > >
> > > > Read is required for a keytab to be permissioned properly, so that looks
> > > > fine in your case.
> > > >
> > > > I do not have my PC with me, but have you tried to use HBase without
> > > > using HBaseContext?
> > > >
> > > > Also which version of Spark are you using?
> > > >
> > > > On 18 Nov 2016 16:01, "Abel Fernández" <mevsmyself@gmail.com> wrote:
> > > >
> > > > > Yep, the keytab is also on the driver, in the same location.
> > > > >
> > > > > -rw-r--r-- 1 hbase root  370 Nov 16 17:13 hbase.keytab
> > > > >
> > > > > Do you know what permissions the keytab should have?
> > > > >
> > > > >
> > > > >
> > > > > On Fri, 18 Nov 2016 at 14:19 Nkechi Achara <nkachara@googlemail.com>
> > > > > wrote:
> > > > >
> > > > > > Sorry, just realised you had the submit command in the attached docs.
> > > > > >
> > > > > > Can I ask if the keytab is also on the driver in the same location?
> > > > > >
> > > > > > The spark option normally requires the keytab to be on the driver so it
> > > > > > can pick it up and pass it to YARN etc. to perform the Kerberos
> > > > > > operations.
> > > > > >
> > > > > > On 18 Nov 2016 3:10 p.m., "Abel Fernández" <mevsmyself@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Nkechi,
> > > > > > >
> > > > > > > Thanks for your early response.
> > > > > > >
> > > > > > > I am currently specifying the principal and the keytab in the
> > > > > > > spark-submit; the keytab is in the same location on every node manager.
> > > > > > >
> > > > > > > SPARK_CONF_DIR=conf-hbase spark-submit --master yarn-cluster \
> > > > > > >   --executor-memory 6G \
> > > > > > >   --num-executors 10 \
> > > > > > >   --queue cards \
> > > > > > >   --executor-cores 4 \
> > > > > > >   --driver-java-options "-Dlog4j.configuration=file:log4j.properties" \
> > > > > > >   --driver-class-path "$2" \
> > > > > > >   --jars file:/opt/orange/lib/rocksdbjni-4.5.1.jar \
> > > > > > >   --conf "spark.driver.extraClassPath=/var/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.2.0-incubating.jar:/var/cloudera/parcels/CDH/jars/hbase-server-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/jars/hbase-common-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/lib/hbase/lib/hbase-client-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/lib/hbase/lib/hbase-protocol-1.0.0-cdh5.5.4.jar:/opt/orange/lib/rocksdbjni-4.5.1.jar:/var/cloudera/parcels/CLABS_PHOENIX-4.5.2-1.clabs_phoenix1.2.0.p0.774/lib/phoenix/lib/phoenix-core-1.2.0.jar:/var/cloudera/parcels/CDH/jars/hadoop-mapreduce-client-core-2.6.0-cdh5.5.4.jar" \
> > > > > > >   --conf "spark.executor.extraClassPath=/var/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.2.0-incubating.jar:/var/cloudera/parcels/CDH/jars/hbase-server-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/jars/hbase-common-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/lib/hbase/lib/hbase-client-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/lib/hbase/lib/hbase-protocol-1.0.0-cdh5.5.4.jar:/opt/orange/lib/rocksdbjni-4.5.1.jar:/var/cloudera/parcels/CLABS_PHOENIX-4.5.2-1.clabs_phoenix1.2.0.p0.774/lib/phoenix/lib/phoenix-core-1.2.0.jar:/var/cloudera/parcels/CDH/jars/hadoop-mapreduce-client-core-2.6.0-cdh5.5.4.jar" \
> > > > > > >   --principal hbase@COMPANY.CORP \
> > > > > > >   --keytab /opt/company/conf/hbase.keytab \
> > > > > > >   --files "owl.properties,conf-hbase/log4j.properties,conf-hbase/hbase-site.xml,conf-hbase/core-site.xml,$2" \
> > > > > > >   --class $1 \
> > > > > > >   cards-batch-$3-jar-with-dependencies.jar $2
> > > > > > >
> > > > > > > On Fri, 18 Nov 2016 at 14:01 Nkechi Achara <nkachara@googlemail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Can you use the principal and keytab options in Spark submit? These
> > > > > > > > should circumvent this issue.
> > > > > > > >
> > > > > > > > On 18 Nov 2016 1:01 p.m., "Abel Fernández" <mevsmyself@gmail.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hello,
> > > > > > > > >
> > > > > > > > > We are having problems with the delegation of the token in a secure
> > > > > > > > > cluster: Delegation Token can be issued only with kerberos or web
> > > > > > > > > authentication
> > > > > > > > >
> > > > > > > > > We have a Spark process which is generating the HFiles to be loaded
> > > > > > > > > into HBase. To generate these HFiles (we are using a back-ported
> > > > > > > > > version of the latest hbase/spark code), we are using the method
> > > > > > > > > HBaseRDDFunctions.hbaseBulkLoadThinRows.
> > > > > > > > >
> > > > > > > > > I think the problem is in the piece of code below. This function is
> > > > > > > > > executed in every partition of the RDD; when the executors try to
> > > > > > > > > execute the code, they do not have a valid Kerberos credential and
> > > > > > > > > cannot execute anything.
> > > > > > > > >
> > > > > > > > > private def hbaseForeachPartition[T](
> > > > > > > > >     configBroadcast: Broadcast[SerializableWritable[Configuration]],
> > > > > > > > >     it: Iterator[T],
> > > > > > > > >     f: (Iterator[T], Connection) => Unit) = {
> > > > > > > > >
> > > > > > > > >     val config = getConf(configBroadcast)
> > > > > > > > >
> > > > > > > > >     applyCreds
> > > > > > > > >     // specify that this is a proxy user
> > > > > > > > >     val smartConn =
> > HBaseConnectionCache.getConnection(config)
> > > > > > > > >     f(it, smartConn.connection)
> > > > > > > > >     smartConn.close()
> > > > > > > > >   }
> > > > > > > > >
> > > > > > > > > I have attached the spark-submit and the complete error log trace.
> > > > > > > > > Has anyone faced this problem before?
> > > > > > > > >
> > > > > > > > > Thanks in advance.
> > > > > > > > >
> > > > > > > > > Regards,
> > > > > > > > > Abel.
> > > > > > > > > --
> > > > > > > > > Un saludo - Best Regards.
> > > > > > > > > Abel
> > > > > > > > >
> > > > > > > >
> > > > > > > --
> > > > > > > Un saludo - Best Regards.
> > > > > > > Abel
> > > > > > >
> > > > > >
> > > > > --
> > > > > Un saludo - Best Regards.
> > > > > Abel
> > > > >
> > > >
> > > > --
> > > > Un saludo - Best Regards.
> > > > Abel
> > > >
> > > --
> > > Un saludo - Best Regards.
> > > Abel
> > >
> >
> --
> Un saludo - Best Regards.
> Abel
>
