Message-ID: <56A7E78D.1050400@gmail.com>
Date: Tue, 26 Jan 2016 16:39:25 -0500
From: Josh Elser
To: user@accumulo.apache.org
Subject: Re: Accumulo and Kerberos

Your confusion stems from what stop-all.sh is actually doing (although I still have no idea how stopping processes has any bearing on how they start :smile:). Notably, this script invokes `accumulo admin stopAll` to trigger a graceful shutdown before stopping the services hard (`kill`). So, just as you would run these scripts as the 'accumulo' user without Kerberos, you should also be logged in as the 'accumulo' Kerberos user when running them.

This might be missing from the docs. Please do suggest where some documentation should be added to cover this.

In case it doesn't go without saying, this is a separate issue from your services not logging in correctly. Can you share logs? Try enabling -Dsun.security.krb5.debug=true in the appropriate environment variable (for the service you want to turn it on for) in accumulo-env.sh and then start the services again (hopefully sharing that output too if the problem isn't obvious).

roman.drapeko@baesystems.com wrote:
> I want to believe in this, but what I see contradicts this statement..
>
> I do bin/stop-all.sh on master.
>
> If I have a ticket cache for hdfs user, I don't see any errors.
> If I don't have a ticket cache for hdfs user, I see these errors.
>
> I can see that all slaves and master successfully logged in as accumulo user.
>
> However slaves are failing straight away due to the error I posted in the previous email. I also see this error when I stop the master and I don't have a ticket cache for hdfs user, however I don't see it if I have a ticket cache (as per above)... It's kind of a reflection of the previous problem with vfs.
>
>
> -----Original Message-----
> From: Josh Elser [mailto:josh.elser@gmail.com]
> Sent: 26 January 2016 21:08
> To: user@accumulo.apache.org
> Subject: Re: Accumulo and Kerberos
>
> Ok, let me repeat: running a `kinit` in your local shell has *no bearing* on what Accumulo is doing. This is fundamentally not how it works. There are libraries in the JDK which perform the login with the KDC using the keytab you provide in accumulo-site.xml. Accumulo is not using the ticket cache which your `kinit` creates.
>
> You should see a message in the log stating that the Kerberos login happened (or didn't). The server should exit if it fails to log in (but I don't know if I've actively tested that). Do you see this message? Does it say you successfully logged in (and the principal you logged in as)?
>
> roman.drapeko@baesystems.com wrote:
>> Ok, there is some progress. So these issues were definitely related to VFS classloader - now works both on the client and master - so I guess a bug is found.
>>
>> And it looks like there is a very similar issue related to instance_id
>>
>> On the slaves (does not matter whether I do kinit hdfs or not) I always receive when I start the node:
>>
>> 2016-01-26 20:36:41,744 [tserver.TabletServer] ERROR: Uncaught exception in TabletServer.main, exiting
>> java.lang.RuntimeException: Can't tell if Accumulo is initialized; can't read instance id at hdfs://cr-platform-qa23-01.cyberreveal.local:8020/accumulo/instance_id
>>
>> On the master I can see the same issue when I do bin/stop-all.sh without kinit hdfs and it disappears if I have a hdfs ticket.
>>
>> I tried both: hadoop fs -chown -R accumulo:hdfs /accumulo and hadoop fs -chown -R accumulo:accumulo /accumulo - same behavior
>>
>> Any thoughts please?
>>
>>
>> -----Original Message-----
>> From: Josh Elser [mailto:josh.elser@gmail.com]
>> Sent: 26 January 2016 20:08
>> To: user@accumulo.apache.org
>> Subject: Re: Accumulo and Kerberos
>>
>> The normal classloader (on the local filesystem) which is configured out of the box.
>>
>> roman.drapeko@baesystems.com wrote:
>>> Hi Josh,
>>>
>>> I can confirm that issue on the master is related to VFS classloader! Commented out classloader and now it works without kinit. So it seems it tries loading classes before Kerberos authentication happened. What classloader should I use instead?
>>>
>>> Regards,
>>> Roman
>>>
>>> -----Original Message-----
>>> From: roman.drapeko@baesystems.com [mailto:roman.drapeko@baesystems.com]
>>> Sent: 26 January 2016 19:43
>>> To: user@accumulo.apache.org
>>> Subject: RE: Accumulo and Kerberos
>>>
>>> Hi Josh,
>>>
>>> Two quick questions.
>>>
>>> 1) What should I use instead of HDFS classloader? All examples seem to be from hdfs.
>>> 2) When is the 1.7.1 release scheduled for (approx.)?
>>>
>>> Regards,
>>> Roman
>>>
>>> -----Original Message-----
>>> From: Josh Elser [mailto:josh.elser@gmail.com]
>>> Sent: 26 January 2016 19:01
>>> To: user@accumulo.apache.org
>>> Subject: Re: Accumulo and Kerberos
>>>
>>> I would strongly recommend that you do not use the HDFS classloader. It is known to be very broken in what you download as 1.7.0. There are a number of JIRA issues about this which stem from a lack of a released commons-vfs2-2.1.
>>>
>>> That being said, I have not done anything with running Accumulo out of HDFS with Kerberos enabled. AFAIK, you're in untraveled waters.
>>>
>>> re: the renewal bug: When the ticket expires, the Accumulo service will die.
>>> Your options are to deploy a watchdog process that would restart the service, download the fix from the JIRA case and rebuild Accumulo yourself, or build 1.7.1-SNAPSHOT from our codebase. I would recommend using 1.7.1-SNAPSHOT as it should be the least painful (1.7.1-SNAPSHOT now is likely to not change significantly from what is ultimately released as 1.7.1)
>>>
>>> roman.drapeko@baesystems.com wrote:
>>>> Hi Josh,
>>>>
>>>> Yes, will do. Just in the meantime - I can see a different issue on slave nodes. If I try to start in isolation (bin/start-here.sh) with or without doing kinit I always see the error below.
>>>>
>>>> 2016-01-26 18:31:13,873 [start.Main] ERROR: Problem initializing the class loader
>>>> java.lang.reflect.InvocationTargetException
>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>     at java.lang.reflect.Method.invoke(Method.java:606)
>>>>     at org.apache.accumulo.start.Main.getClassLoader(Main.java:68)
>>>>     at org.apache.accumulo.start.Main.main(Main.java:52)
>>>> Caused by: org.apache.commons.vfs2.FileSystemException: Could not determine the type of file "hdfs:///platform/lib/.*.jar".
>>>>     at org.apache.commons.vfs2.provider.AbstractFileObject.attach(AbstractFileObject.java:1522)
>>>>     at org.apache.commons.vfs2.provider.AbstractFileObject.getType(AbstractFileObject.java:489)
>>>>     at org.apache.accumulo.start.classloader.vfs.AccumuloVFSClassLoader.resolve(AccumuloVFSClassLoader.java:143)
>>>>     at org.apache.accumulo.start.classloader.vfs.AccumuloVFSClassLoader.resolve(AccumuloVFSClassLoader.java:121)
>>>>     at org.apache.accumulo.start.classloader.vfs.AccumuloVFSClassLoader.getClassLoader(AccumuloVFSClassLoader.java:211)
>>>>     ... 6 more
>>>> Caused by: org.apache.hadoop.security.AccessControlException: SIMPLE authentication is not enabled. Available:[TOKEN, KERBEROS]
>>>>
>>>> I guess it might be different to what I observe on the master node. If I don't get a ticket explicitly, I get the error mentioned in the previous email. However, if I do (and it does not matter for what user I have a ticket now - whether it's accumulo, hdfs or hive) - it works. So I started to think, maybe the problem is related to some action (for example to vfs as per above) that tries to access HDFS before doing a proper authentication with Kerberos? Any ideas?
>>>>
>>>> Also, if we go live with 1.7.0 - what approach would you recommend for renewing tickets? Does it require stopping and starting the cluster?
>>>>
>>>> Regards,
>>>> Roman
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Josh Elser [mailto:josh.elser@gmail.com]
>>>> Sent: 26 January 2016 18:10
>>>> To: user@accumulo.apache.org
>>>> Subject: Re: Accumulo and Kerberos
>>>>
>>>> Hi Roman,
>>>>
>>>> Accumulo services (TabletServer, Master, etc) all use a keytab to automatically obtain a ticket from the KDC when they start up. You do not need to do anything with kinit when starting Accumulo.
>>>>
>>>> One worry is ACCUMULO-4069[1] with all presently released versions (most notably 1.7.0 which you are using). This is a bug in which services did not automatically renew their ticket. We're working on a 1.7.1, but it's not out yet.
>>>>
>>>> As for debugging your issue, take a look at the Kerberos section on debugging in the user manual [2]. Take a very close look at the principal the service is using to obtain the ticket and what the principal is for your keytab. A good sanity check is to make sure you can `kinit` in the shell using the keytab and the correct principal (rule out the keytab being incorrect).
>>>>
>>>> If you still get stuck, collect the output specifying -Dsun.security.krb5.debug=true in accumulo-env.sh (per the instructions) and try enabling log4j DEBUG on org.apache.hadoop.security.UserGroupInformation.
>>>>
>>>> - Josh
>>>>
>>>> [1] https://issues.apache.org/jira/browse/ACCUMULO-4069
>>>> [2] http://accumulo.apache.org/1.7/accumulo_user_manual.html#_debugging
>>>>
>>>> roman.drapeko@baesystems.com wrote:
>>>>> Hi there,
>>>>>
>>>>> Trying to set up Accumulo 1.7 on Kerberized cluster. Only interested in master/tablets to be kerberized (not end-users). Configured everything as per manual:
>>>>>
>>>>> 1) Created principals
>>>>>
>>>>> 2) Generated glob keytab
>>>>>
>>>>> 3) Modified accumulo-site.xml providing general.kerberos.keytab and general.kerberos.principal
>>>>>
>>>>> If I start as accumulo user I get: Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
>>>>>
>>>>> However, if I give explicitly a token with kinit and keytab generated above in the shell - it works as expected. To my understanding Accumulo has to obtain tickets automatically? Or the idea is to write a cron job and apply kinit to every tablet server per day?
>>>>>
>>>>> Regards,
>>>>>
>>>>> Roman
>>>>>
>>>>> Please consider the environment before printing this email. This message should be regarded as confidential. If you have received this email in error please notify the sender and destroy it immediately. Statements of intent shall only become binding when confirmed in hard copy by an authorised signatory. The contents of this email may relate to dealings with other companies under the control of BAE Systems Applied Intelligence Limited, details of which can be found at http://www.baesystems.com/Businesses/index.htm.
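
The thread's advice to look closely at the keytab and principal configured in accumulo-site.xml can be partly automated before reaching for `kinit`. Below is a minimal sketch in Python, assuming accumulo-site.xml uses the usual Hadoop-style `<configuration><property><name>/<value>` layout. The two property names come from the thread; the function names and the `file_exists` hook are hypothetical helpers for illustration only. It only checks that both properties are set and that the keytab path exists. It does not contact the KDC and does not verify that the principal is actually present in the keytab.

```python
import os
import sys
import xml.etree.ElementTree as ET

# Property names as mentioned in the thread; everything else is illustrative.
KEYTAB_KEY = "general.kerberos.keytab"
PRINCIPAL_KEY = "general.kerberos.principal"


def read_site_properties(xml_text):
    """Parse Hadoop-style <configuration><property> XML into a dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def check_kerberos_settings(props, file_exists=os.path.isfile):
    """Return a list of readable problems; an empty list means none were found."""
    problems = []
    keytab = props.get(KEYTAB_KEY, "")
    principal = props.get(PRINCIPAL_KEY, "")
    if not keytab:
        problems.append(KEYTAB_KEY + " is not set")
    elif not file_exists(keytab):
        # Hypothetical check: the file must be visible to the user running this.
        problems.append("keytab file not found: " + keytab)
    if not principal:
        problems.append(PRINCIPAL_KEY + " is not set")
    return problems


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (path is an assumption): python check_site.py conf/accumulo-site.xml
    with open(sys.argv[1]) as handle:
        found = check_kerberos_settings(read_site_properties(handle.read()))
    for line in found or ["no obvious problems found"]:
        print(line)
```

Run it as the same OS user that starts the Accumulo services, since a keytab readable by one account may not be readable by another. A clean result here only rules out the simplest misconfigurations; the `kinit` sanity check and the krb5 debug output described in the thread remain the real tests.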