hbase-user mailing list archives

From Nkechi Achara <nkach...@googlemail.com>
Subject Re: hbase/spark - Delegation Token can be issued only with kerberos or web authentication
Date Mon, 21 Nov 2016 21:11:42 GMT
I am still convinced that this could be a class path issue, but I might
be missing something.
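
One way to rule the class path in or out is to check that every entry exists on each host. The following is only a minimal sketch: the function name missing_classpath_entries is made up for illustration, and it assumes you paste in the same colon-separated value that you pass as spark.driver.extraClassPath or spark.executor.extraClassPath.

```shell
# Print every entry of a colon-separated class path that does not exist
# on the host where this is run. (Hypothetical helper, not part of Spark.)
missing_classpath_entries() {
  old_ifs=$IFS
  IFS=:          # split the argument on ':'
  set -f         # do not expand wildcards while splitting
  for entry in $1; do
    # skip empty fields such as the one produced by "a::b"
    [ -n "$entry" ] || continue
    [ -e "$entry" ] || printf '%s\n' "$entry"
  done
  set +f
  IFS=$old_ifs
}
```

Running it on the driver host and on a NodeManager host with the real value should print nothing if every jar is present; any path it prints is missing on that host.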

Just to make sure: have you tested the principal / keytab on the driver
alone, so you can confirm that the TGT is valid?
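
A rough way to test this by hand on the driver host is sketched below. The function name check_keytab_login is hypothetical, and the sketch assumes the MIT Kerberos kinit and klist tools are installed. It logs in from the keytab into a throwaway credential cache, so your normal ticket cache is left alone.

```shell
# Try a keytab login and list the resulting ticket.
# (Hypothetical helper; assumes MIT Kerberos kinit/klist are on the PATH.)
check_keytab_login() {
  keytab=$1
  principal=$2
  if [ ! -r "$keytab" ]; then
    echo "keytab not readable: $keytab" >&2
    return 2
  fi
  # Use a temporary credential cache so the default one is not overwritten.
  cache_dir=$(mktemp -d) || return 1
  KRB5CCNAME="FILE:$cache_dir/cc" kinit -k -t "$keytab" "$principal" &&
    KRB5CCNAME="FILE:$cache_dir/cc" klist
  status=$?
  rm -rf "$cache_dir"
  return $status
}
```

With the values from your spark-submit this would be called as: check_keytab_login /opt/company/conf/hbase.keytab hbase@COMPANY.CORP. If klist shows a krbtgt entry, the keytab and principal work on that host.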

I am using the same config, but on CDH 5.5.2 and with a retrofit of the
Cloudera Labs hbase on spark code.

Thanks

On 21 Nov 2016 5:32 p.m., "Abel Fernández" <mevsmyself@gmail.com> wrote:

> I have added the krb5.conf and the jaas.conf to the spark-submit and to
> all NodeManagers and drivers, but I am still having the same problem.
>
> I think the problem is in this piece of code: it tries to execute a
> function on the executors and, for some reason, the executors cannot get
> valid credentials.
>
> /**
>  * A simple enrichment of the traditional Spark RDD foreachPartition.
>  * This function differs from the original in that it offers the
>  * developer access to an already connected Connection object
>  *
>  * Note: Do not close the Connection object.  All Connection
>  * management is handled outside this method
>  *
>  * @param rdd  Original RDD with data to iterate over
>  * @param f    Function to be given an iterator to iterate through
>  *             the RDD values and a Connection object to interact
>  *             with HBase
>  */
> def foreachPartition[T](rdd: RDD[T],
>                         f: (Iterator[T], Connection) => Unit):Unit = {
>   rdd.foreachPartition(
>     it => hbaseForeachPartition(broadcastedConf, it, f))
> }
>
>
> The first thing hbaseForeachPartition does is get the credentials, but I
> think this code is never executed:
>
> /**
>  *  Underlying wrapper for all foreach functions in HBaseContext
>  */
> private def hbaseForeachPartition[T](
>     configBroadcast: Broadcast[SerializableWritable[Configuration]],
>     it: Iterator[T],
>     f: (Iterator[T], Connection) => Unit) = {
>
>   val config = getConf(configBroadcast)
>
>   applyCreds
>   // specify that this is a proxy user
>   val smartConn = HBaseConnectionCache.getConnection(config)
>   f(it, smartConn.connection)
>   smartConn.close()
> }
>
>
> This is the latest spark-submit I am using:
> #!/bin/bash
>
> SPARK_CONF_DIR=conf-hbase spark-submit --master yarn-cluster \
>   --executor-memory 6G \
>   --num-executors 10 \
>   --queue cards \
>   --executor-cores 4 \
>   --driver-java-options "-Dlog4j.configuration=file:log4j.properties" \
>   --driver-java-options "-Djava.security.krb5.conf=/etc/krb5.conf" \
>   --driver-java-options "-Djava.security.auth.login.config=/opt/company/conf/jaas.conf" \
>   --driver-class-path "$2" \
>   --jars file:/opt/company/lib/rocksdbjni-4.5.1.jar \
>   --conf "spark.driver.extraClassPath=/var/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.2.0-incubating.jar:/var/cloudera/parcels/CDH/jars/hbase-server-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/jars/hbase-common-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/lib/hbase/lib/hbase-client-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/lib/hbase/lib/hbase-protocol-1.0.0-cdh5.5.4.jar:/opt/orange/lib/rocksdbjni-4.5.1.jar:/var/cloudera/parcels/CLABS_PHOENIX-4.5.2-1.clabs_phoenix1.2.0.p0.774/lib/phoenix/lib/phoenix-core-1.2.0.jar:/var/cloudera/parcels/CDH/jars/hadoop-mapreduce-client-core-2.6.0-cdh5.5.4.jar" \
>   --conf "spark.executor.extraClassPath=/var/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.2.0-incubating.jar:/var/cloudera/parcels/CDH/jars/hbase-server-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/jars/hbase-common-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/lib/hbase/lib/hbase-client-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/lib/hbase/lib/hbase-protocol-1.0.0-cdh5.5.4.jar:/opt/orange/lib/rocksdbjni-4.5.1.jar:/var/cloudera/parcels/CLABS_PHOENIX-4.5.2-1.clabs_phoenix1.2.0.p0.774/lib/phoenix/lib/phoenix-core-1.2.0.jar:/var/cloudera/parcels/CDH/jars/hadoop-mapreduce-client-core-2.6.0-cdh5.5.4.jar" \
>   --principal hbase@COMPANY.CORP \
>   --keytab /opt/company/conf/hbase.keytab \
>   --files "owl.properties,conf-hbase/log4j.properties,conf-hbase/hbase-site.xml,conf-hbase/core-site.xml,$2" \
>   --class $1 \
>   cards-batch-$3-jar-with-dependencies.jar $2
>
>
>
> On Fri, 18 Nov 2016 at 16:37 Abel Fernández <mevsmyself@gmail.com> wrote:
>
> > No worries.
> >
> > This is the spark version we are using:  1.5.0-cdh5.5.4
> >
> > I have to use the HBase context; it is the first parameter of the
> > method I am using to generate the HFiles
> > (HbaseRDDFunctions.hbaseBulkLoadThinRows).
> >
> > On Fri, 18 Nov 2016 at 16:06 Nkechi Achara <nkachara@googlemail.com>
> > wrote:
> >
> > Sorry, I am on my way to a flight.
> >
> > A keytab needs read permission to work properly, so that looks fine in
> > your case.
> >
> > I do not have my PC with me, but have you tried to use HBase without
> > the HBase context?
> >
> > Also which version of Spark are you using?
> >
> > On 18 Nov 2016 16:01, "Abel Fernández" <mevsmyself@gmail.com> wrote:
> >
> > > Yep, the keytab is also on the driver, in the same location.
> > >
> > > -rw-r--r-- 1 hbase root  370 Nov 16 17:13 hbase.keytab
> > >
> > > Do you know what permissions the keytab should have?
> > >
> > >
> > >
> > > On Fri, 18 Nov 2016 at 14:19 Nkechi Achara <nkachara@googlemail.com>
> > > wrote:
> > >
> > > > Sorry, I just realised you had the submit command in the attached
> > > > docs.
> > > >
> > > > Can I ask if the keytab is also on the driver in the same location?
> > > >
> > > > The Spark option normally requires the keytab to be on the driver,
> > > > so it can pick it up and pass it to YARN etc. to perform the
> > > > Kerberos operations.
> > > >
> > > > On 18 Nov 2016 3:10 p.m., "Abel Fernández" <mevsmyself@gmail.com>
> > wrote:
> > > >
> > > > > Hi Nkechi,
> > > > >
> > > > > Thanks for your quick response.
> > > > >
> > > > > I am currently specifying the principal and the keytab in the
> > > > > spark-submit; the keytab is in the same location on every
> > > > > NodeManager.
> > > > >
> > > > > SPARK_CONF_DIR=conf-hbase spark-submit --master yarn-cluster \
> > > > >   --executor-memory 6G \
> > > > >   --num-executors 10 \
> > > > >   --queue cards \
> > > > >   --executor-cores 4 \
> > > > >   --driver-java-options "-Dlog4j.configuration=file:log4j.properties" \
> > > > >   --driver-class-path "$2" \
> > > > >   --jars file:/opt/orange/lib/rocksdbjni-4.5.1.jar \
> > > > >   --conf "spark.driver.extraClassPath=/var/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.2.0-incubating.jar:/var/cloudera/parcels/CDH/jars/hbase-server-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/jars/hbase-common-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/lib/hbase/lib/hbase-client-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/lib/hbase/lib/hbase-protocol-1.0.0-cdh5.5.4.jar:/opt/orange/lib/rocksdbjni-4.5.1.jar:/var/cloudera/parcels/CLABS_PHOENIX-4.5.2-1.clabs_phoenix1.2.0.p0.774/lib/phoenix/lib/phoenix-core-1.2.0.jar:/var/cloudera/parcels/CDH/jars/hadoop-mapreduce-client-core-2.6.0-cdh5.5.4.jar" \
> > > > >   --conf "spark.executor.extraClassPath=/var/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.2.0-incubating.jar:/var/cloudera/parcels/CDH/jars/hbase-server-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/jars/hbase-common-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/lib/hbase/lib/hbase-client-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/lib/hbase/lib/hbase-protocol-1.0.0-cdh5.5.4.jar:/opt/orange/lib/rocksdbjni-4.5.1.jar:/var/cloudera/parcels/CLABS_PHOENIX-4.5.2-1.clabs_phoenix1.2.0.p0.774/lib/phoenix/lib/phoenix-core-1.2.0.jar:/var/cloudera/parcels/CDH/jars/hadoop-mapreduce-client-core-2.6.0-cdh5.5.4.jar" \
> > > > >   --principal hbase@COMPANY.CORP \
> > > > >   --keytab /opt/company/conf/hbase.keytab \
> > > > >   --files "owl.properties,conf-hbase/log4j.properties,conf-hbase/hbase-site.xml,conf-hbase/core-site.xml,$2" \
> > > > >   --class $1 \
> > > > >   cards-batch-$3-jar-with-dependencies.jar $2
> > > > >
> > > > > On Fri, 18 Nov 2016 at 14:01 Nkechi Achara <
> nkachara@googlemail.com>
> > > > > wrote:
> > > > >
> > > > > > Can you use the principal and keytab options in Spark submit?
> > > > > > These should circumvent this issue.
> > > > > >
> > > > > > On 18 Nov 2016 1:01 p.m., "Abel Fernández" <mevsmyself@gmail.com
> >
> > > > wrote:
> > > > > >
> > > > > > > Hello,
> > > > > > >
> > > > > > > We are having problems with token delegation in a secure
> > > > > > > cluster: "Delegation Token can be issued only with kerberos
> > > > > > > or web authentication".
> > > > > > >
> > > > > > > We have a Spark process that generates the HFiles to be
> > > > > > > loaded into HBase. To generate these HFiles we are using the
> > > > > > > method HBaseRDDFunctions.hbaseBulkLoadThinRows (from a
> > > > > > > back-ported version of the latest hbase/spark code).
> > > > > > >
> > > > > > > I think the problem is in the piece of code below. This
> > > > > > > function is executed in every partition of the RDD. When the
> > > > > > > executors try to run it, they do not have a valid Kerberos
> > > > > > > credential and cannot execute anything.
> > > > > > >
> > > > > > > private def hbaseForeachPartition[T](
> > > > > > >     configBroadcast: Broadcast[SerializableWritable[Configuration]],
> > > > > > >     it: Iterator[T],
> > > > > > >     f: (Iterator[T], Connection) => Unit) = {
> > > > > > >
> > > > > > >     val config = getConf(configBroadcast)
> > > > > > >
> > > > > > >     applyCreds
> > > > > > >     // specify that this is a proxy user
> > > > > > >     val smartConn = HBaseConnectionCache.getConnection(config)
> > > > > > >     f(it, smartConn.connection)
> > > > > > >     smartConn.close()
> > > > > > >   }
> > > > > > >
> > > > > > > I have attached the spark-submit and the complete error log
> > > > > > > trace. Has anyone faced this problem before?
> > > > > > >
> > > > > > > Thanks in advance.
> > > > > > >
> > > > > > > Regards,
> > > > > > > Abel.
> > > > > > > --
> > > > > > > Un saludo - Best Regards.
> > > > > > > Abel
> > > > > > >
