hbase-user mailing list archives

From Abel Fernández <mevsmys...@gmail.com>
Subject Re: hbase/spark - Delegation Token can be issued only with kerberos or web authentication
Date Mon, 21 Nov 2016 16:32:13 GMT
I have added krb5.conf and jaas.conf to the spark-submit command and to all
node managers and drivers, but I am still having the same problem.

I think the problem is in this piece of code: it tries to execute a
function on the executors and, for some reason, the executors cannot get
valid credentials.

/**
 * A simple enrichment of the traditional Spark RDD foreachPartition.
 * This function differs from the original in that it offers the
 * developer access to an already connected Connection object
 *
 * Note: Do not close the Connection object.  All Connection
 * management is handled outside this method
 *
 * @param rdd  Original RDD with data to iterate over
 * @param f    Function to be given an iterator to iterate through
 *             the RDD values and a Connection object to interact
 *             with HBase
 */
def foreachPartition[T](rdd: RDD[T],
                        f: (Iterator[T], Connection) => Unit):Unit = {
  rdd.foreachPartition(
    it => hbaseForeachPartition(broadcastedConf, it, f))
}


The first thing hbaseForeachPartition tries to do is get the credentials,
but I think this code is never executed:

/**
 *  Underlying wrapper for all foreach functions in HBaseContext
 */
private def hbaseForeachPartition[T](
    configBroadcast: Broadcast[SerializableWritable[Configuration]],
    it: Iterator[T],
    f: (Iterator[T], Connection) => Unit) = {

  val config = getConf(configBroadcast)

  applyCreds
  // specify that this is a proxy user
  val smartConn = HBaseConnectionCache.getConnection(config)
  f(it, smartConn.connection)
  smartConn.close()
}


This is the latest spark-submit I am using:
#!/bin/bash

SPARK_CONF_DIR=conf-hbase spark-submit --master yarn-cluster \
  --executor-memory 6G \
  --num-executors 10 \
  --queue cards \
  --executor-cores 4 \
  --driver-java-options "-Dlog4j.configuration=file:log4j.properties" \
  --driver-java-options "-Djava.security.krb5.conf=/etc/krb5.conf" \
  --driver-java-options "-Djava.security.auth.login.config=/opt/company/conf/jaas.conf" \
  --driver-class-path "$2" \
  --jars file:/opt/company/lib/rocksdbjni-4.5.1.jar \
  --conf "spark.driver.extraClassPath=/var/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.2.0-incubating.jar:/var/cloudera/parcels/CDH/jars/hbase-server-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/jars/hbase-common-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/lib/hbase/lib/hbase-client-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/lib/hbase/lib/hbase-protocol-1.0.0-cdh5.5.4.jar:/opt/orange/lib/rocksdbjni-4.5.1.jar:/var/cloudera/parcels/CLABS_PHOENIX-4.5.2-1.clabs_phoenix1.2.0.p0.774/lib/phoenix/lib/phoenix-core-1.2.0.jar:/var/cloudera/parcels/CDH/jars/hadoop-mapreduce-client-core-2.6.0-cdh5.5.4.jar" \
  --conf "spark.executor.extraClassPath=/var/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.2.0-incubating.jar:/var/cloudera/parcels/CDH/jars/hbase-server-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/jars/hbase-common-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/lib/hbase/lib/hbase-client-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/lib/hbase/lib/hbase-protocol-1.0.0-cdh5.5.4.jar:/opt/orange/lib/rocksdbjni-4.5.1.jar:/var/cloudera/parcels/CLABS_PHOENIX-4.5.2-1.clabs_phoenix1.2.0.p0.774/lib/phoenix/lib/phoenix-core-1.2.0.jar:/var/cloudera/parcels/CDH/jars/hadoop-mapreduce-client-core-2.6.0-cdh5.5.4.jar" \
  --principal hbase@COMPANY.CORP \
  --keytab /opt/company/conf/hbase.keytab \
  --files "owl.properties,conf-hbase/log4j.properties,conf-hbase/hbase-site.xml,conf-hbase/core-site.xml,$2" \
  --class $1 \
  cards-batch-$3-jar-with-dependencies.jar $2
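
One thing that may be worth checking in the command above: spark-submit appears to keep only the last value when --driver-java-options is given more than once, so the log4j and krb5.conf settings might be dropped in favour of the jaas.conf one. A minimal sketch of passing all three through a single flag follows. join_java_opts and DRIVER_OPTS are hypothetical names used only for illustration, and this is not a confirmed fix for the delegation token error.

```shell
#!/bin/bash

# Hypothetical helper (not part of Spark): join all arguments into one
# space-separated string so they fit in a single --driver-java-options value.
join_java_opts() {
  local IFS=' '
  printf '%s\n' "$*"
}

# The three JVM options taken from the spark-submit command in this message.
DRIVER_OPTS=$(join_java_opts \
  "-Dlog4j.configuration=file:log4j.properties" \
  "-Djava.security.krb5.conf=/etc/krb5.conf" \
  "-Djava.security.auth.login.config=/opt/company/conf/jaas.conf")

# Print the single flag that would replace the three repeated flags.
printf '%s "%s"\n' "--driver-java-options" "$DRIVER_OPTS"
```

The driver options only reach the driver JVM. If the executors also need the krb5.conf and jaas.conf settings, the same string could be tried with --conf "spark.executor.extraJavaOptions=...", assuming both files exist at those paths on every node.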



On Fri, 18 Nov 2016 at 16:37 Abel Fernández <mevsmyself@gmail.com> wrote:

> No worries.
>
> This is the spark version we are using:  1.5.0-cdh5.5.4
>
> I have to use Hbase context, it is the first parameter for the method I am
> using to generate the HFiles (HbaseRDDFunctions.hbaseBulkLoadThinRows)
>
> On Fri, 18 Nov 2016 at 16:06 Nkechi Achara <nkachara@googlemail.com>
> wrote:
>
> Sorry on my way to a flight.
>
> Read is required for a keytab to be permissioned properly. So that looks
> fine in your case.
>
> I do not have my PC with me, but have you tried to use HBase without using
> HBaseContext?
>
> Also which version of Spark are you using?
>
> On 18 Nov 2016 16:01, "Abel Fernández" <mevsmyself@gmail.com> wrote:
>
> > Yep, the keytab is also on the driver, in the same location.
> >
> > -rw-r--r-- 1 hbase root  370 Nov 16 17:13 hbase.keytab
> >
> > Do you know what permissions the keytab should have?
> >
> >
> >
> > On Fri, 18 Nov 2016 at 14:19 Nkechi Achara <nkachara@googlemail.com>
> > wrote:
> >
> > > Sorry just realised you had the submit command in the attached docs.
> > >
> > > Can I ask if the keytab is also on the driver in the same location?
> > >
> > > > The spark option normally requires the keytab to be on the driver so it can
> > > > pick it up and pass it to yarn etc to perform the kerberos operations.
> > >
> > > > On 18 Nov 2016 3:10 p.m., "Abel Fernández" <mevsmyself@gmail.com> wrote:
> > >
> > > > Hi Nkechi,
> > > >
> > > > > Thanks for your quick response.
> > > > >
> > > > > I am currently specifying the principal and the keytab in the spark-submit;
> > > > > the keytab is in the same location on every node manager.
> > > >
> > > > SPARK_CONF_DIR=conf-hbase spark-submit --master yarn-cluster \
> > > >   --executor-memory 6G \
> > > >   --num-executors 10 \
> > > >   --queue cards \
> > > >   --executor-cores 4 \
> > > > >   --driver-java-options "-Dlog4j.configuration=file:log4j.properties" \
> > > >   --driver-class-path "$2" \
> > > >   --jars file:/opt/orange/lib/rocksdbjni-4.5.1.jar \
> > > > >   --conf "spark.driver.extraClassPath=/var/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.2.0-incubating.jar:/var/cloudera/parcels/CDH/jars/hbase-server-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/jars/hbase-common-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/lib/hbase/lib/hbase-client-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/lib/hbase/lib/hbase-protocol-1.0.0-cdh5.5.4.jar:/opt/orange/lib/rocksdbjni-4.5.1.jar:/var/cloudera/parcels/CLABS_PHOENIX-4.5.2-1.clabs_phoenix1.2.0.p0.774/lib/phoenix/lib/phoenix-core-1.2.0.jar:/var/cloudera/parcels/CDH/jars/hadoop-mapreduce-client-core-2.6.0-cdh5.5.4.jar" \
> > > > >   --conf "spark.executor.extraClassPath=/var/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.2.0-incubating.jar:/var/cloudera/parcels/CDH/jars/hbase-server-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/jars/hbase-common-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/lib/hbase/lib/hbase-client-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/lib/hbase/lib/hbase-protocol-1.0.0-cdh5.5.4.jar:/opt/orange/lib/rocksdbjni-4.5.1.jar:/var/cloudera/parcels/CLABS_PHOENIX-4.5.2-1.clabs_phoenix1.2.0.p0.774/lib/phoenix/lib/phoenix-core-1.2.0.jar:/var/cloudera/parcels/CDH/jars/hadoop-mapreduce-client-core-2.6.0-cdh5.5.4.jar" \
> > > >   --principal hbase@COMPANY.CORP \
> > > >   --keytab /opt/company/conf/hbase.keytab \
> > > > >   --files "owl.properties,conf-hbase/log4j.properties,conf-hbase/hbase-site.xml,conf-hbase/core-site.xml,$2" \
> > > >   --class $1 \
> > > >   cards-batch-$3-jar-with-dependencies.jar $2
> > > >
> > > > On Fri, 18 Nov 2016 at 14:01 Nkechi Achara <nkachara@googlemail.com>
> > > > wrote:
> > > >
> > > > > > Can you use the principal and keytab options in Spark submit? These should
> > > > > > circumvent this issue.
> > > > >
> > > > > > On 18 Nov 2016 1:01 p.m., "Abel Fernández" <mevsmyself@gmail.com> wrote:
> > > > >
> > > > > > Hello,
> > > > > >
> > > > > > > We are having problems with the delegation of the token in a secure
> > > > > > > cluster: Delegation Token can be issued only with kerberos or web
> > > > > > > authentication
> > > > > >
> > > > > > > We have a spark process which is generating the hfiles to be loaded into
> > > > > > > hbase. To generate these hfiles (we are using a back-ported version of the
> > > > > > > latest hbase/spark code), we are using this method:
> > > > > > > HBaseRDDFunctions.hbaseBulkLoadThinRows.
> > > > > >
> > > > > > > I think the problem is in the below piece of code. This function is
> > > > > > > executed in every partition of the rdd; when the executors are trying to
> > > > > > > execute the code, the executors do not have a valid kerberos credential and
> > > > > > > cannot execute anything.
> > > > > >
> > > > > > > private def hbaseForeachPartition[T](
> > > > > > >     configBroadcast: Broadcast[SerializableWritable[Configuration]],
> > > > > > >     it: Iterator[T],
> > > > > > >     f: (Iterator[T], Connection) => Unit) = {
> > > > > >
> > > > > >     val config = getConf(configBroadcast)
> > > > > >
> > > > > >     applyCreds
> > > > > >     // specify that this is a proxy user
> > > > > >     val smartConn = HBaseConnectionCache.getConnection(config)
> > > > > >     f(it, smartConn.connection)
> > > > > >     smartConn.close()
> > > > > >   }
> > > > > >
> > > > > > > I have attached the spark-submit and the complete error log trace. Has
> > > > > > > anyone faced this problem before?
> > > > > >
> > > > > > Thanks in advance.
> > > > > >
> > > > > > Regards,
> > > > > > Abel.
> > > > > > --
> > > > > > Un saludo - Best Regards.
> > > > > > Abel
> > > > > >
> > > > >
> > > > --
> > > > Un saludo - Best Regards.
> > > > Abel
> > > >
> > >
> > --
> > Un saludo - Best Regards.
> > Abel
> >
>
> --
> Un saludo - Best Regards.
> Abel
>
-- 
Un saludo - Best Regards.
Abel
