Date: Thu, 18 Dec 2014 03:20:13 +0000 (UTC)
From: "Corey J. Nolet (JIRA)"
To: notifications@accumulo.apache.org
Subject: [jira] [Updated] (ACCUMULO-2971) ChangePassword tool should refuse to run if no write access to HDFS

     [ https://issues.apache.org/jira/browse/ACCUMULO-2971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Corey J. Nolet updated ACCUMULO-2971:
-------------------------------------
    Fix Version/s:     (was: 1.6.2)
                       1.6.3

> ChangePassword tool should refuse to run if no write access to HDFS
> -------------------------------------------------------------------
>
>                 Key: ACCUMULO-2971
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-2971
>             Project: Accumulo
>          Issue Type: Bug
>    Affects Versions: 1.5.0, 1.5.1, 1.6.0
>            Reporter: Sean Busbey
>            Priority: Critical
>              Labels: newbie
>             Fix For: 1.5.3, 1.7.0, 1.6.3
>
>
> Currently, the ChangePassword tool doesn't do any check to ensure the user running it has the ability to write to /accumulo/instance_id.
> In the event that an admin knows the instance secret but runs the command as a user who cannot write to the instance_id, the result is an unhelpful error message and a disconnect between HDFS and zookeeper.
> Example for cluster with instance named "foobar"
> {code}
> [busbey@edge ~]$ hdfs dfs -ls /accumulo/instance_id
> Found 1 items
> -rw-r--r-- 3 accumulo accumulo 0 2014-07-02 09:05 /accumulo/instance_id/cb977c77-3e13-4522-b718-2b487d722fd4
> [busbey@edge ~]$ accumulo org.apache.accumulo.server.util.ChangeSecret
> old zookeeper password:
> new zookeeper password:
> Thread "org.apache.accumulo.server.util.ChangeSecret" died Permission denied: user=busbey, access=WRITE, inode="/accumulo":accumulo:accumulo:drwxr-x--x
>     at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:224)
>     at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:204)
>     at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:152)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4846)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:2911)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:2872)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:2859)
>     at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:642)
>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:408)
>     at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44968)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1752)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1748)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1746)
> org.apache.hadoop.security.AccessControlException: Permission denied: user=busbey, access=WRITE, inode="/accumulo":accumulo:accumulo:drwxr-x--x
>     at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:224)
>     at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:204)
>     at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:152)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4846)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:2911)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:2872)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:2859)
>     at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:642)
>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:408)
>     at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44968)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1752)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1748)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1746)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>     at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:90)
>     at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:57)
>     at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:1489)
>     at org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:355)
>     at org.apache.accumulo.server.util.ChangeSecret.updateHdfs(ChangeSecret.java:150)
>     at org.apache.accumulo.server.util.ChangeSecret.main(ChangeSecret.java:66)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.accumulo.start.Main$1.run(Main.java:141)
>     at java.lang.Thread.run(Thread.java:662)
> Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=busbey, access=WRITE, inode="/accumulo":accumulo:accumulo:drwxr-x--x
>     at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:224)
>     at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:204)
>     at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:152)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4846)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:2911)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:2872)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:2859)
>     at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:642)
>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:408)
>     at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44968)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1752)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1748)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1746)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1238)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
>     at $Proxy16.delete(Unknown Source)
>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.delete(ClientNamenodeProtocolTranslatorPB.java:408)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
>     at $Proxy17.delete(Unknown Source)
>     at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:1487)
>     ... 9 more
> [busbey@edge ~]$ hdfs dfs -ls /accumulo/instance_id
> Found 1 items
> -rw-r--r-- 3 accumulo accumulo 0 2014-07-02 09:05 /accumulo/instance_id/cb977c77-3e13-4522-b718-2b487d722fd4
> [busbey@edge ~]$ zookeeper-client
> Connecting to localhost:2181
> Welcome to ZooKeeper!
> JLine support is enabled
>
> WATCHER::
>
> WatchedEvent state:SyncConnected type:None path:null
> [zk: localhost:2181(CONNECTED) 0] get /accumulo/instances/foobar
> 1528cc95-2600-4649-a50e-1645404e9d6c
> cZxid = 0xe00034f45
> ctime = Wed Jul 02 09:27:58 PDT 2014
> mZxid = 0xe00034f45
> mtime = Wed Jul 02 09:27:58 PDT 2014
> pZxid = 0xe00034f45
> cversion = 0
> dataVersion = 0
> aclVersion = 0
> ephemeralOwner = 0x0
> dataLength = 36
> numChildren = 0
> [zk: localhost:2181(CONNECTED) 1] ls /accumulo/1528cc95-2600-4649-a50e-1645404e9d6c
> [users, monitor, problems, root_tablet, gc, hdfs_reservations, table_locks, namespaces, recovery, fate, tservers, tables, next_file, tracers, config, dead, bulk_failed_copyq, masters]
> [zk: localhost:2181(CONNECTED) 2] ls /accumulo/cb977c77-3e13-4522-b718-2b487d722fd4
> [users, problems, monitor, root_tablet, hdfs_reservations, gc, table_locks, namespaces, recovery, fate, tservers, tables, next_file, tracers, config, masters, bulk_failed_copyq, dead]
> {code}
> What's worse, in this condition the cluster will come up properly and show everything as fine if the old instance secret is used.
> However, clients and servers will now end up looking at different zookeeper nodes, depending on whether they got the instance_id from HDFS or from a ZK instance-name lookup, so long as each uses the corresponding instance secret.
> Furthermore, if an admin runs the CleanZooKeeper utility after this failure, it will wipe out the zookeeper nodes the server processes are looking at.
> The utility should do a sanity check that /accumulo/instance_id is writable before changing zookeeper. It should also wait to update the instance-name-to-instance_id pointer in zookeeper until after HDFS has been updated.
> Workaround: manually edit the HDFS instance_id to match the new instance id found in zookeeper for the instance name, and proceed as though the secret change had succeeded.
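For illustration, a minimal sketch of the proposed guard and ordering follows. It is not the committed fix for ACCUMULO-2971: it assumes Hadoop's FileSystem.access() API (available since Hadoop 2.6, HDFS-6570; on older releases the check could instead attempt to create and delete a probe file under instance_id), and the names ChangeSecretSanityCheck, verifyHdfsWritePermission, updateHdfsInstanceId, and updateZooKeeperInstancePointer are hypothetical placeholders, not methods in ChangeSecret.

{code}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsAction;

// Illustrative sketch only -- not the actual patch for ACCUMULO-2971.
public class ChangeSecretSanityCheck {

  // Fail fast if the current user cannot write to the instance_id directory.
  // FileSystem.access() throws AccessControlException (an IOException) when
  // the requested permission is missing, before any state has been changed.
  static void verifyHdfsWritePermission(FileSystem fs, Path instanceIdDir) throws IOException {
    fs.access(instanceIdDir, FsAction.WRITE);
  }

  // Hypothetical driver showing the safe ordering of the two updates.
  static void changeSecret(FileSystem fs, String newInstanceId) throws IOException {
    Path instanceIdDir = new Path("/accumulo/instance_id");

    // 1. Sanity check before any zookeeper state is modified.
    verifyHdfsWritePermission(fs, instanceIdDir);

    // 2. Rewrite the instance_id marker file in HDFS first...
    updateHdfsInstanceId(fs, instanceIdDir, newInstanceId);

    // 3. ...and only then repoint the instance name in zookeeper, so a
    // failure while updating HDFS can never leave the two stores disagreeing.
    updateZooKeeperInstancePointer(newInstanceId);
  }

  // Placeholder: delete the old marker file and create the new one.
  static void updateHdfsInstanceId(FileSystem fs, Path dir, String id) throws IOException {
  }

  // Placeholder: set /accumulo/instances/<name> to the new instance id.
  static void updateZooKeeperInstancePointer(String id) {
  }

  public static void main(String[] args) throws IOException {
    changeSecret(FileSystem.get(new Configuration()), args[0]);
  }
}
{code}

The key property is that the instance-name pointer in zookeeper is the last thing touched, so a permission failure like the one above surfaces before HDFS and zookeeper can diverge.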