hbase-user mailing list archives

From Bharath Vissapragada <bhara...@cloudera.com>
Subject Re: Occasional GSSException that brings down region server
Date Tue, 11 Mar 2014 13:13:51 GMT
Hey Wei,

Can you try adding "-Dsun.security.krb5.debug=true" to the region server
JVM opts and see whether it prints anything before the crash?
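
For reference, a minimal sketch of one way to set that flag, assuming the
region servers are started through the stock scripts and read
conf/hbase-env.sh (the file location is an assumption; adjust it to your
install). It appends the flag to HBASE_REGIONSERVER_OPTS and keeps whatever
options are already set:

```shell
# conf/hbase-env.sh on each region server host (location assumed).
# Append the Kerberos debug flag, preserving any options already present.
export HBASE_REGIONSERVER_OPTS="${HBASE_REGIONSERVER_OPTS:-} -Dsun.security.krb5.debug=true"
```

The region server has to be restarted for this to take effect. The JDK
writes this debug output to stdout, so it will most likely end up in the
region server's .out file rather than in the regular log.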

- Bharath


On Tue, Mar 11, 2014 at 6:35 PM, Wei Tan <wtan@us.ibm.com> wrote:

> Thanks Ted. Yes, our team looked at the doc you pointed out. The key
> here is "every several hours", which lets us rule out the following:
>
> 1) A ticket validity problem: klist shows a valid ticket.
> 2) [0] does not list our error message. Password / keytab / clocks /
> realm are not incorrect, and all the errors on that page seem to
> describe "does not work at all" conditions, not a "fails after a
> random, long amount of time" condition.
> 3) We don't have the "problematic combination of components" listed,
> but again, that is a work / no-work dichotomy.
>
>
> Thanks,
> Wei
>
> ---------------------------------
> Wei Tan, PhD
> Research Staff Member
> IBM T. J. Watson Research Center
> http://researcher.ibm.com/person/us-wtan
>
>
>
> From:   Ted Yu <yuzhihong@gmail.com>
> To:     "user@hbase.apache.org" <user@hbase.apache.org>,
> Date:   03/10/2014 05:31 PM
> Subject:        Re: Occasional GSSException that brings down region server
>
>
>
> Have you looked at
> http://hbase.apache.org/book.html#trouble.client.security.rpc ?
>
>
> On Mon, Mar 10, 2014 at 2:26 PM, Wei Tan <wtan@us.ibm.com> wrote:
>
> > Hi,
> >
> >   We are running an HBase cluster with Kerberos enabled, on the
> > following versions:
> > HBase: 0.96.1.1
> > Zookeeper: 3.4.5
> > Hadoop: 1.1.1
> >
> >
> > We constantly put data into HBase, and every several hours we get the
> > error below on a random region server; when it occurs, the region
> > server kills itself.
> >
> > ERROR:
> > 2014-02-28 09:32:39,755 ERROR
> > [hconnection-0x116987ad-shared--pool1378-t9]
> > security.UserGroupInformation: PriviledgedActionException
> > as:XXXXXXXX@DOMAIN cause:javax.security.sasl.SaslException: GSS initiate
> > failed [Caused by GSSException: No valid credentials provided (Mechanism
> > level: The ticket isn't for us (35) - BAD TGS SERVER NAME)]
> >
> >
> >
> > We also tried multiple versions of the KDC, all the way up to the
> > latest, 1.12.1, and still see this error. What is weird is that most
> > puts are processed successfully until this error occurs and kills the RS.
> >
> > Thanks,
> > Wei
> > ---------------------------------
> > Wei Tan, PhD
> > Research Staff Member
> > IBM T. J. Watson Research Center
> > http://researcher.ibm.com/person/us-wtan
>
>


-- 
Bharath Vissapragada
<http://www.cloudera.com>
