hbase-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Shankar hiremath <shankar.hirem...@huawei.com>
Subject HBase file encryption, inconsistencies observed and data loss
Date Fri, 25 Jul 2014 14:08:25 GMT
HBase file encryption some inconsistencies observed and data loss happens after running the
hbck tool,
the operation steps are as below.    (one thing what I observed is, on startup of HMaster
if it is not able to process the WAL file, then also it moved to /oldWALs)

Procedure:
1. Start the Hbase services (HMaster & region Server)
2. Enable HFile encryption and WAL file encryption as below, and perform 'table4-0' put operations
(100 records added)
<property>
<name>hbase.crypto.keyprovider</name>
<value>org.apache.hadoop.hbase.io.crypto.KeyStoreKeyProvider</value>
</property>
<property>
<name>hbase.crypto.keyprovider.parameters</name>
<value>jceks:///opt/shankar1/kdc_keytab/hbase.jks?password=Hadoop@234</value>
</property>
<property>
<name>hbase.crypto.master.key.name</name>
<value>hdfs</value>
</property>
<property>
<name>hfile.format.version</name>
<value>3</value>
</property>
<property>
<name>hbase.regionserver.hlog.reader.impl</name>
<value>org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogReader</value>
</property>
<property>
<name>hbase.regionserver.hlog.writer.impl</name>
<value>org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogWriter</value>
</property>
<property>
<name>hbase.regionserver.wal.encryption</name>
<value>true</value>
</property>
3. Machine went down, so all process went down

4. We disabled the WAL file encryption for performance reason, and keep encryption only for
Hfile, as below
<property>
<name>hbase.crypto.keyprovider</name>
<value>org.apache.hadoop.hbase.io.crypto.KeyStoreKeyProvider</value>
</property>
<property>
<name>hbase.crypto.keyprovider.parameters</name>
<value>jceks:///opt/shankar1/kdc_keytab/hbase.jks?password=Hadoop@234</value>
</property>
<property>
<name>hbase.crypto.master.key.name</name>
<value>hdfs</value>
</property>
<property>
<name>hfile.format.version</name>
<value>3</value>
</property>
5. Start the Region Server and query the 'table4-0' data
hbase(main):003:0> count 'table4-0'
ERROR: org.apache.hadoop.hbase.NotServingRegionException: Region table4-0,,1406207815456.fc10620a3dcc14e004ab034420f7d332.
is not online on XX-XX-XX-XX,60020,1406209023146
at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2685)
at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:4119)
at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3066)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29497)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2084)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:168)
at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:39)
at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:111)
at java.lang.Thread.run(Thread.java:662)
6. Not able to read the data, so we decided to revert back the configuration (as original)
7. Kill/Stop the Region Server, revert all the configurations as original, as below
<property>
<name>hbase.crypto.keyprovider</name>
<value>org.apache.hadoop.hbase.io.crypto.KeyStoreKeyProvider</value>
</property>
<property>
<name>hbase.crypto.keyprovider.parameters</name>
<value>jceks:///opt/shankar1/kdc_keytab/hbase.jks?password=Hadoop@234</value>
</property>
<property>
<name>hbase.crypto.master.key.name</name>
<value>hdfs</value>
</property>
<property>
<name>hfile.format.version</name>
<value>3</value>
</property>
<property>
<name>hbase.regionserver.hlog.reader.impl</name>
<value>org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogReader</value>
</property>
<property>
<name>hbase.regionserver.hlog.writer.impl</name>
<value>org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogWriter</value>
</property>
<property>
<name>hbase.regionserver.wal.encryption</name>
<value>true</value>
</property>
7. Start the Region Server, and perform the 'table4-0' query
hbase(main):003:0> count 'table4-0'
ERROR: org.apache.hadoop.hbase.NotServingRegionException: Region table4-0,,1406207815456.fc10620a3dcc14e004ab034420f7d332.
is not online on XX-XX-XX-XX,60020,1406209023146
at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2685)
at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:4119)
at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3066)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29497)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2084)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:168)
at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:39)
at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:111)
at java.lang.Thread.run(Thread.java:662)
8. Run the hbase hbck to repair, as below
./hbase hbck -details
.........................
Summary:
table1-0 is okay.
Number of regions: 0
Deployed on:
table2-0 is okay.
Number of regions: 0
Deployed on:
table3-0 is okay.
Number of regions: 0
Deployed on:
table4-0 is okay.
Number of regions: 0
Deployed on:
table5-0 is okay.
Number of regions: 0
Deployed on:
table6-0 is okay.
Number of regions: 0
Deployed on:
table7-0 is okay.
Number of regions: 0
Deployed on:
table8-0 is okay.
Number of regions: 0
Deployed on:
table9-0 is okay.
Number of regions: 0
Deployed on:
hbase:meta is okay.
Number of regions: 1
Deployed on: XX-XX-XX-XX,60020,1406209023146
hbase:acl is okay.
Number of regions: 0
Deployed on:
hbase:namespace is okay.
Number of regions: 0
Deployed on:
22 inconsistencies detected.
Status: INCONSISTENT
2014-07-24 19:13:05,532 INFO [main] client.HConnectionManager$HConnectionImplementation: Closing
master protocol: MasterService
2014-07-24 19:13:05,533 INFO [main] client.HConnectionManager$HConnectionImplementation: Closing
zookeeper sessionid=0x1475d1611611bcf
2014-07-24 19:13:05,533 DEBUG [main] zookeeper.ZooKeeper: Closing session: 0x1475d1611611bcf
2014-07-24 19:13:05,533 DEBUG [main] zookeeper.ClientCnxn: Closing client for session: 0x1475d1611611bcf
2014-07-24 19:13:05,546 DEBUG [main-SendThread(XX-XX-XX-XX:2181)] zookeeper.ClientCnxn: Reading
reply sessionid:0x1475d1611611bcf, packet:: clientPath:null serverPath:null finished:false
header:: 6,-11 replyHeader:: 6,4295102074,0 request:: null response:: null
2014-07-24 19:13:05,546 DEBUG [main] zookeeper.ClientCnxn: Disconnecting client for session:
0x1475d1611611bcf
2014-07-24 19:13:05,546 DEBUG [main-SendThread(XX-XX-XX-XX:2181)] zookeeper.ClientCnxn: An
exception was thrown while closing send thread for session 0x1475d1611611bcf : Unable to read
additional data from server sessionid 0x1475d1611611bcf, likely server has closed socket
2014-07-24 19:13:05,546 INFO [main-EventThread] zookeeper.ClientCnxn: EventThread shut down
2014-07-24 19:13:05,546 INFO [main] zookeeper.ZooKeeper: Session: 0x1475d1611611bcf closed
shankar1@XX-XX-XX-XX:~/DataSight/hbase/bin>
9. Fix the assignments as below
./hbase hbck -fixAssignments
Summary:
table1-0 is okay.
Number of regions: 1
Deployed on: XX-XX-XX-XX,60020,1406209023146
table2-0 is okay.
Number of regions: 1
Deployed on: XX-XX-XX-XX,60020,1406209023146
table3-0 is okay.
Number of regions: 1
Deployed on: XX-XX-XX-XX,60020,1406209023146
table4-0 is okay.
Number of regions: 1
Deployed on: XX-XX-XX-XX,60020,1406209023146
table5-0 is okay.
Number of regions: 1
Deployed on: XX-XX-XX-XX,60020,1406209023146
table6-0 is okay.
Number of regions: 1
Deployed on: XX-XX-XX-XX,60020,1406209023146
table7-0 is okay.
Number of regions: 1
Deployed on: XX-XX-XX-XX,60020,1406209023146
table8-0 is okay.
Number of regions: 1
Deployed on: XX-XX-XX-XX,60020,1406209023146
table9-0 is okay.
Number of regions: 1
Deployed on: XX-XX-XX-XX,60020,1406209023146
hbase:meta is okay.
Number of regions: 1
Deployed on: XX-XX-XX-XX,60020,1406209023146
hbase:acl is okay.
Number of regions: 1
Deployed on: XX-XX-XX-XX,60020,1406209023146
hbase:namespace is okay.
Number of regions: 1
Deployed on: XX-XX-XX-XX,60020,1406209023146
0 inconsistencies detected.
Status: OK
2014-07-24 19:44:55,194 INFO [main] client.HConnectionManager$HConnectionImplementation: Closing
master protocol: MasterService
2014-07-24 19:44:55,194 INFO [main] client.HConnectionManager$HConnectionImplementation: Closing
zookeeper sessionid=0x2475d15f7b31b73
2014-07-24 19:44:55,194 DEBUG [main] zookeeper.ZooKeeper: Closing session: 0x2475d15f7b31b73
2014-07-24 19:44:55,194 DEBUG [main] zookeeper.ClientCnxn: Closing client for session: 0x2475d15f7b31b73
2014-07-24 19:44:55,203 DEBUG [main-SendThread(XX-XX-XX-XX:2181)] zookeeper.ClientCnxn: Reading
reply sessionid:0x2475d15f7b31b73, packet:: clientPath:null serverPath:null finished:false
header:: 7,-11 replyHeader:: 7,4295102377,0 request:: null response:: null
2014-07-24 19:44:55,203 DEBUG [main] zookeeper.ClientCnxn: Disconnecting client for session:
0x2475d15f7b31b73
2014-07-24 19:44:55,204 DEBUG [main-SendThread(XX-XX-XX-XX:2181)] zookeeper.ClientCnxn: An
exception was thrown while closing send thread for session 0x2475d15f7b31b73 : Unable to read
additional data from server sessionid 0x2475d15f7b31b73, likely server has closed socket
2014-07-24 19:44:55,204 INFO [main] zookeeper.ZooKeeper: Session: 0x2475d15f7b31b73 closed
2014-07-24 19:44:55,204 INFO [main-EventThread] zookeeper.ClientCnxn: EventThread shut down
10. Fix the assignments as below
./hbase hbck -fixAssignments -fixMeta
Summary:
table1-0 is okay.
Number of regions: 1
Deployed on: XX-XX-XX-XX,60020,1406209023146
table2-0 is okay.
Number of regions: 1
Deployed on: XX-XX-XX-XX,60020,1406209023146
table3-0 is okay.
Number of regions: 1
Deployed on: XX-XX-XX-XX,60020,1406209023146
table4-0 is okay.
Number of regions: 1
Deployed on: XX-XX-XX-XX,60020,1406209023146
table5-0 is okay.
Number of regions: 1
Deployed on: XX-XX-XX-XX,60020,1406209023146
table6-0 is okay.
Number of regions: 1
Deployed on: XX-XX-XX-XX,60020,1406209023146
table7-0 is okay.
Number of regions: 1
Deployed on: XX-XX-XX-XX,60020,1406209023146
table8-0 is okay.
Number of regions: 1
Deployed on: XX-XX-XX-XX,60020,1406209023146
table9-0 is okay.
Number of regions: 1
Deployed on: XX-XX-XX-XX,60020,1406209023146
hbase:meta is okay.
Number of regions: 1
Deployed on: XX-XX-XX-XX,60020,1406209023146
hbase:acl is okay.
Number of regions: 1
Deployed on: XX-XX-XX-XX,60020,1406209023146
hbase:namespace is okay.
Number of regions: 1
Deployed on: XX-XX-XX-XX,60020,1406209023146
0 inconsistencies detected.
Status: OK
2014-07-24 19:46:16,290 INFO [main] client.HConnectionManager$HConnectionImplementation: Closing
master protocol: MasterService
2014-07-24 19:46:16,290 INFO [main] client.HConnectionManager$HConnectionImplementation: Closing
zookeeper sessionid=0x3475d1605321be9
2014-07-24 19:46:16,290 DEBUG [main] zookeeper.ZooKeeper: Closing session: 0x3475d1605321be9
2014-07-24 19:46:16,290 DEBUG [main] zookeeper.ClientCnxn: Closing client for session: 0x3475d1605321be9
2014-07-24 19:46:16,300 DEBUG [main-SendThread(XX-XX-XX-XX:2181)] zookeeper.ClientCnxn: Reading
reply sessionid:0x3475d1605321be9, packet:: clientPath:null serverPath:null finished:false
header:: 6,-11 replyHeader:: 6,4295102397,0 request:: null response:: null
2014-07-24 19:46:16,300 DEBUG [main] zookeeper.ClientCnxn: Disconnecting client for session:
0x3475d1605321be9
2014-07-24 19:46:16,300 DEBUG [main-SendThread(XX-XX-XX-XX:2181)] zookeeper.ClientCnxn: An
exception was thrown while closing send thread for session 0x3475d1605321be9 : Unable to read
additional data from server sessionid 0x3475d1605321be9, likely server has closed socket
2014-07-24 19:46:16,300 INFO [main] zookeeper.ZooKeeper: Session: 0x3475d1605321be9 closed
2014-07-24 19:46:16,300 INFO [main-EventThread] zookeeper.ClientCnxn: EventThread shut down
hbase(main):006:0> count 'table4-0'
0 row(s) in 0.0200 seconds
=> 0
hbase(main):007:0>
Complete data loss happened,
WALs, oldWALs & /hbase/data/default/table4-0/ does not have any data



[X]
This e-mail and its attachments contain confidential information from HUAWEI, which is intended
only for the person or entity whose address is listed above. Any use of the information contained
herein in any way (including, but not limited to, total or partial disclosure, reproduction,
or dissemination) by persons other than the intended recipient(s) is prohibited. If you receive
this e-mail in error, please notify the sender by phone or email immediately and delete it!
[X]






Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message