Date: Thu, 24 Jul 2014 14:40:38 +0000 (UTC)
From: "shankarlingayya (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Updated] (HBASE-11584) HBase file encryption, consistences observed and data loss

     [ https://issues.apache.org/jira/browse/HBASE-11584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shankarlingayya updated HBASE-11584:
------------------------------------
    Description:
Procedure:
1. Start the HBase services (HMaster and Region Server).
2. Enable HFile encryption and WAL encryption as below, then perform put operations on 'table4-0' (100 records added).

    hbase.crypto.keyprovider
        org.apache.hadoop.hbase.io.crypto.KeyStoreKeyProvider
    hbase.crypto.keyprovider.parameters
        jceks:///opt/shankar1/kdc_keytab/hbase.jks?password=Hadoop@234
    hbase.crypto.master.key.name
        hdfs
    hfile.format.version
        3
    hbase.regionserver.hlog.reader.impl
        org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogReader
    hbase.regionserver.hlog.writer.impl
        org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogWriter
    hbase.regionserver.wal.encryption
        true

3. The machine went down, so all processes went down.
4. We disabled WAL encryption for performance reasons and kept encryption only for HFiles, as below.

    hbase.crypto.keyprovider
        org.apache.hadoop.hbase.io.crypto.KeyStoreKeyProvider
    hbase.crypto.keyprovider.parameters
        jceks:///opt/shankar1/kdc_keytab/hbase.jks?password=Hadoop@234
    hbase.crypto.master.key.name
        hdfs
    hfile.format.version
        3

5. Start the Region Server and query the 'table4-0' data.

hbase(main):003:0> count 'table4-0'
ERROR: org.apache.hadoop.hbase.NotServingRegionException: Region table4-0,,1406207815456.fc10620a3dcc14e004ab034420f7d332. is not online on XX-XX-XX-XX,60020,1406209023146
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2685)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:4119)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3066)
    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29497)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2084)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
    at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:168)
    at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:39)
    at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:111)
    at java.lang.Thread.run(Thread.java:662)

6. We were not able to read the data, so we decided to revert the configuration to the original.
7. Kill/stop the Region Server and revert all of the configuration to the original, as below.

    hbase.crypto.keyprovider
        org.apache.hadoop.hbase.io.crypto.KeyStoreKeyProvider
    hbase.crypto.keyprovider.parameters
        jceks:///opt/shankar1/kdc_keytab/hbase.jks?password=Hadoop@234
    hbase.crypto.master.key.name
        hdfs
    hfile.format.version
        3
    hbase.regionserver.hlog.reader.impl
        org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogReader
    hbase.regionserver.hlog.writer.impl
        org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogWriter
    hbase.regionserver.wal.encryption
        true

7. Start the Region Server and query 'table4-0' again.

hbase(main):003:0> count 'table4-0'
ERROR: org.apache.hadoop.hbase.NotServingRegionException: Region table4-0,,1406207815456.fc10620a3dcc14e004ab034420f7d332. is not online on XX-XX-XX-XX,60020,1406209023146
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2685)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:4119)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3066)
    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29497)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2084)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
    at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:168)
    at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:39)
    at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:111)
    at java.lang.Thread.run(Thread.java:662)

8. Run the hbase hbck to repair, as below.

./hbase hbck -details
.........................
Summary:
  table1-0 is okay.
    Number of regions: 0
    Deployed on:
  table2-0 is okay.
    Number of regions: 0
    Deployed on:
  table3-0 is okay.
    Number of regions: 0
    Deployed on:
  table4-0 is okay.
    Number of regions: 0
    Deployed on:
  table5-0 is okay.
    Number of regions: 0
    Deployed on:
  table6-0 is okay.
    Number of regions: 0
    Deployed on:
  table7-0 is okay.
    Number of regions: 0
    Deployed on:
  table8-0 is okay.
    Number of regions: 0
    Deployed on:
  table9-0 is okay.
    Number of regions: 0
    Deployed on:
  hbase:meta is okay.
    Number of regions: 1
    Deployed on: XX-XX-XX-XX,60020,1406209023146
  hbase:acl is okay.
    Number of regions: 0
    Deployed on:
  hbase:namespace is okay.
    Number of regions: 0
    Deployed on:
22 inconsistencies detected.
Status: INCONSISTENT
2014-07-24 19:13:05,532 INFO [main] client.HConnectionManager$HConnectionImplementation: Closing master protocol: MasterService
2014-07-24 19:13:05,533 INFO [main] client.HConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x1475d1611611bcf
2014-07-24 19:13:05,533 DEBUG [main] zookeeper.ZooKeeper: Closing session: 0x1475d1611611bcf
2014-07-24 19:13:05,533 DEBUG [main] zookeeper.ClientCnxn: Closing client for session: 0x1475d1611611bcf
2014-07-24 19:13:05,546 DEBUG [main-SendThread(XX-XX-XX-XX:2181)] zookeeper.ClientCnxn: Reading reply sessionid:0x1475d1611611bcf, packet:: clientPath:null serverPath:null finished:false header:: 6,-11 replyHeader:: 6,4295102074,0 request:: null response:: null
2014-07-24 19:13:05,546 DEBUG [main] zookeeper.ClientCnxn: Disconnecting client for session: 0x1475d1611611bcf
2014-07-24 19:13:05,546 DEBUG [main-SendThread(XX-XX-XX-XX:2181)] zookeeper.ClientCnxn: An exception was thrown while closing send thread for session 0x1475d1611611bcf : Unable to read additional data from server sessionid 0x1475d1611611bcf, likely server has closed socket
2014-07-24 19:13:05,546 INFO [main-EventThread] zookeeper.ClientCnxn: EventThread shut down
2014-07-24 19:13:05,546 INFO [main] zookeeper.ZooKeeper: Session: 0x1475d1611611bcf closed
shankar1@XX-XX-XX-XX:~/DataSight/hbase/bin>

9. Fix the assignments as below

./hbase hbck -fixAssignments
Summary:
  table1-0 is okay.
    Number of regions: 1
    Deployed on: XX-XX-XX-XX,60020,1406209023146
  table2-0 is okay.
    Number of regions: 1
    Deployed on: XX-XX-XX-XX,60020,1406209023146
  table3-0 is okay.
    Number of regions: 1
    Deployed on: XX-XX-XX-XX,60020,1406209023146
  table4-0 is okay.
    Number of regions: 1
    Deployed on: XX-XX-XX-XX,60020,1406209023146
  table5-0 is okay.
    Number of regions: 1
    Deployed on: XX-XX-XX-XX,60020,1406209023146
  table6-0 is okay.
    Number of regions: 1
    Deployed on: XX-XX-XX-XX,60020,1406209023146
  table7-0 is okay.
    Number of regions: 1
    Deployed on: XX-XX-XX-XX,60020,1406209023146
  table8-0 is okay.
    Number of regions: 1
    Deployed on: XX-XX-XX-XX,60020,1406209023146
  table9-0 is okay.
    Number of regions: 1
    Deployed on: XX-XX-XX-XX,60020,1406209023146
  hbase:meta is okay.
    Number of regions: 1
    Deployed on: XX-XX-XX-XX,60020,1406209023146
  hbase:acl is okay.
    Number of regions: 1
    Deployed on: XX-XX-XX-XX,60020,1406209023146
  hbase:namespace is okay.
    Number of regions: 1
    Deployed on: XX-XX-XX-XX,60020,1406209023146
0 inconsistencies detected.
Status: OK
2014-07-24 19:44:55,194 INFO [main] client.HConnectionManager$HConnectionImplementation: Closing master protocol: MasterService
2014-07-24 19:44:55,194 INFO [main] client.HConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x2475d15f7b31b73
2014-07-24 19:44:55,194 DEBUG [main] zookeeper.ZooKeeper: Closing session: 0x2475d15f7b31b73
2014-07-24 19:44:55,194 DEBUG [main] zookeeper.ClientCnxn: Closing client for session: 0x2475d15f7b31b73
2014-07-24 19:44:55,203 DEBUG [main-SendThread(XX-XX-XX-XX:2181)] zookeeper.ClientCnxn: Reading reply sessionid:0x2475d15f7b31b73, packet:: clientPath:null serverPath:null finished:false header:: 7,-11 replyHeader:: 7,4295102377,0 request:: null response:: null
2014-07-24 19:44:55,203 DEBUG [main] zookeeper.ClientCnxn: Disconnecting client for session: 0x2475d15f7b31b73
2014-07-24 19:44:55,204 DEBUG [main-SendThread(XX-XX-XX-XX:2181)] zookeeper.ClientCnxn: An exception was thrown while closing send thread for session 0x2475d15f7b31b73 : Unable to read additional data from server sessionid 0x2475d15f7b31b73, likely server has closed socket
2014-07-24 19:44:55,204 INFO [main] zookeeper.ZooKeeper: Session: 0x2475d15f7b31b73 closed
2014-07-24 19:44:55,204 INFO [main-EventThread] zookeeper.ClientCnxn: EventThread shut down

10. Fix the assignments as below

./hbase hbck -fixAssignments -fixMeta
Summary:
  table1-0 is okay.
    Number of regions: 1
    Deployed on: XX-XX-XX-XX,60020,1406209023146
  table2-0 is okay.
    Number of regions: 1
    Deployed on: XX-XX-XX-XX,60020,1406209023146
  table3-0 is okay.
    Number of regions: 1
    Deployed on: XX-XX-XX-XX,60020,1406209023146
  table4-0 is okay.
    Number of regions: 1
    Deployed on: XX-XX-XX-XX,60020,1406209023146
  table5-0 is okay.
    Number of regions: 1
    Deployed on: XX-XX-XX-XX,60020,1406209023146
  table6-0 is okay.
    Number of regions: 1
    Deployed on: XX-XX-XX-XX,60020,1406209023146
  table7-0 is okay.
    Number of regions: 1
    Deployed on: XX-XX-XX-XX,60020,1406209023146
  table8-0 is okay.
    Number of regions: 1
    Deployed on: XX-XX-XX-XX,60020,1406209023146
  table9-0 is okay.
    Number of regions: 1
    Deployed on: XX-XX-XX-XX,60020,1406209023146
  hbase:meta is okay.
    Number of regions: 1
    Deployed on: XX-XX-XX-XX,60020,1406209023146
  hbase:acl is okay.
    Number of regions: 1
    Deployed on: XX-XX-XX-XX,60020,1406209023146
  hbase:namespace is okay.
    Number of regions: 1
    Deployed on: XX-XX-XX-XX,60020,1406209023146
0 inconsistencies detected.
Status: OK
2014-07-24 19:46:16,290 INFO [main] client.HConnectionManager$HConnectionImplementation: Closing master protocol: MasterService
2014-07-24 19:46:16,290 INFO [main] client.HConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x3475d1605321be9
2014-07-24 19:46:16,290 DEBUG [main] zookeeper.ZooKeeper: Closing session: 0x3475d1605321be9
2014-07-24 19:46:16,290 DEBUG [main] zookeeper.ClientCnxn: Closing client for session: 0x3475d1605321be9
2014-07-24 19:46:16,300 DEBUG [main-SendThread(XX-XX-XX-XX:2181)] zookeeper.ClientCnxn: Reading reply sessionid:0x3475d1605321be9, packet:: clientPath:null serverPath:null finished:false header:: 6,-11 replyHeader:: 6,4295102397,0 request:: null response:: null
2014-07-24 19:46:16,300 DEBUG [main] zookeeper.ClientCnxn: Disconnecting client for session: 0x3475d1605321be9
2014-07-24 19:46:16,300 DEBUG [main-SendThread(XX-XX-XX-XX:2181)] zookeeper.ClientCnxn: An exception was thrown while closing send thread for session 0x3475d1605321be9 : Unable to read additional data from server sessionid 0x3475d1605321be9, likely server has closed socket
2014-07-24 19:46:16,300 INFO [main] zookeeper.ZooKeeper: Session: 0x3475d1605321be9 closed
2014-07-24 19:46:16,300 INFO [main-EventThread] zookeeper.ClientCnxn: EventThread shut down

hbase(main):006:0> count 'table4-0'
0 row(s) in 0.0200 seconds
=> 0
hbase(main):007:0>

Complete data loss happened: WALs, oldWALs and /hbase/data/default/table4-0/ do not have any data.

> HBase file encryption, consistences observed and data loss
> ----------------------------------------------------------
>
>                 Key: HBASE-11584
>                 URL: https://issues.apache.org/jira/browse/HBASE-11584
>             Project: HBase
>          Issue Type: Bug
>          Components: hbck, HFile
>    Affects Versions: 0.98.3
>         Environment: SuSE 11 SP3
>            Reporter: shankarlingayya
>            Priority: Critical

--
This message was sent by Atlassian JIRA
(v6.2#6252)
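
For anyone trying to reproduce the setup, the settings listed in step 2 would normally be entries in hbase-site.xml. The fragment below is only a sketch: the property names and values are copied from the report, while the <configuration>/<property>/<name>/<value> wrapper is the standard Hadoop configuration layout and is an assumption here, since no markup appears in the message text. The comment marks the three WAL-related settings that step 4 removed and step 7 restored.

```xml
<!-- Sketch of the step 2 settings as hbase-site.xml entries.
     Names and values come from the report; the XML wrapper is assumed. -->
<configuration>
  <property>
    <name>hbase.crypto.keyprovider</name>
    <value>org.apache.hadoop.hbase.io.crypto.KeyStoreKeyProvider</value>
  </property>
  <property>
    <name>hbase.crypto.keyprovider.parameters</name>
    <value>jceks:///opt/shankar1/kdc_keytab/hbase.jks?password=Hadoop@234</value>
  </property>
  <property>
    <name>hbase.crypto.master.key.name</name>
    <value>hdfs</value>
  </property>
  <property>
    <name>hfile.format.version</name>
    <value>3</value>
  </property>

  <!-- WAL encryption: the three properties below were removed in step 4
       and put back in step 7 of the report. -->
  <property>
    <name>hbase.regionserver.hlog.reader.impl</name>
    <value>org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogReader</value>
  </property>
  <property>
    <name>hbase.regionserver.hlog.writer.impl</name>
    <value>org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogWriter</value>
  </property>
  <property>
    <name>hbase.regionserver.wal.encryption</name>
    <value>true</value>
  </property>
</configuration>
```

With the last three properties removed, the file matches the HFile-only configuration from step 4, which is the state in which the Region Server was restarted while encrypted WALs from step 2 were still on disk.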