MIME-Version: 1.0
X-Received: by 10.112.158.199 with SMTP id ww7mr13425185lbb.71.1406461392275; Sun, 27 Jul 2014 04:43:12 -0700 (PDT)
Received: by 10.112.143.102 with HTTP; Sun, 27 Jul 2014 04:43:12 -0700 (PDT)
In-Reply-To:
References: <8594AB46-DB50-4ECC-B51F-DDCAAFD02B83@gmail.com>
Date: Sun, 27 Jul 2014 17:13:12 +0530
Message-ID:
Subject: Re: HBase file encryption, inconsistencies observed and data loss
From: Anoop John
To: "user@hbase.apache.org"
Content-Type: multipart/alternative; boundary=001a11c33aa8ad555704ff2b4fd8
X-Virus-Checked: Checked by ClamAV on apache.org

--001a11c33aa8ad555704ff2b4fd8
Content-Type: text/plain; charset=UTF-8

SecureProtobufLogReader can read encrypted as well as unencrypted files.

Anoop

On Sunday, July 27, 2014, ramkrishna vasudevan <ramkrishna.s.vasudevan@gmail.com> wrote:

> I think in the above case, even though encryption is disabled, we will
> still need to use the securelogreader for the new files that will be
> created as well? I don't have the code with me now. But if that is the
> case, we need to look at it, as I feel only the existing files should be
> read with securelogreader. The new WAL should be read using the log reader.
> Moving to the corrupt folder is fine unless we could bring it back to the
> main working folder.
> Sent from mobile, excuse any typos.
> On Jul 27, 2014 10:07 AM, "Anoop John" wrote:
>
>> As per Shankar, he can get things to work with the configs below:
>>
>> hbase.regionserver.hlog.reader.impl
>> org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogReader
>>
>> hbase.regionserver.hlog.writer.impl
>> org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogWriter
>>
>> hbase.regionserver.wal.encryption
>> false
>>
>> After the RS crash, the config is kept as above. See that WAL encryption
>> is disabled now. Still, note that the reader is SecureProtobufLogReader.
>> The existing WAL files are encrypted and only SecureProtobufLogReader can
>> read them. So if that is not configured, the default reader, i.e.
>> ProtobufLogReader, cannot read them back correctly. So this is the issue
>> that Shankar faced.
>>
>> Also, it is concerning that when the file cannot be read, it is not moved
>> under the corrupt logs. Need to look at that.
>>
>> -Anoop-
>>
>> On Sat, Jul 26, 2014 at 11:17 PM, Andrew Purtell <andrew.purtell@gmail.com> wrote:
>>
>> > My attempt to reproduce this issue:
>> >
>> > 1. Set up Hadoop 2.4.1 namenode, secondarynamenode, and datanode on a dev box.
>> >
>> > 2. Set up HBase 0.98.5-SNAPSHOT hosted zk, master, and regionserver also on this dev box.
>> >
>> > 3. Set dfs.replication and hbase.regionserver.hlog.tolerable.lowreplication to 1. Set up a keystore and enabled WAL encryption.
>> >
>> > 4. Created a test table.
>> >
>> > 5. Used YCSB to write 1000 rows to the test table. No flushes observed.
>> >
>> > 6. Used the shell to count the number of records in the test table. Count = 1000 rows
>> >
>> > 7. kill -9 the regionserver process.
>> >
>> > 8. Started a new regionserver process. Observed log splitting and replay in the regionserver log, no errors.
>> >
>> > 9. Used the shell to count the number of records in the test table. Count = 1000 rows
>> >
>> > Tried this a few times.
>> >
>> > Shankar, can you try running through the above and let us know if the
>> > outcome is different?
>> >
>> > On Sat, Jul 26, 2014 at 8:54 AM, Andrew Purtell <andrew.purtell@gmail.com> wrote:
>> >
>> > > Thanks for the detail. So to summarize:
>> > >
>> > > 0. HBase 0.98.3 and HDFS 2.4.1
>> > >
>> > > 1. All data before failure has not yet been flushed so only exists in the WAL files.
>> > >
>> > > 2. During distributed splitting, the WAL has either not been written out or is unreadable:
>> > >
>> > > 2014-07-26 19:29:16,160 ERROR [RS_LOG_REPLAY_OPS-host1:60020-0] codec.BaseDecoder: Partial cell read caused by EOF: java.io.IOException: Premature EOF from inputStream
>> > >
>> > > 3. This file is still moved to oldWALs even though splitting failed.
>> > >
>> > > 4. Setting 'hbase.regionserver.wal.encryption' to false allows for data recovery in your scenario.
>> > >
>> > > See https://issues.apache.org/jira/browse/HBASE-11595
>> > >
>> > > On Jul 26, 2014, at 6:50 AM, Shankar hiremath <shankar.hiremath@huawei.com> wrote:
>> > >
>> > > Hi Andrew,
>> > >
>> > > Please find the details:
>> > >
>> > > HBase 0.98.3 & Hadoop 2.4.1
>> > > HBase root file system on HDFS
>> > >
>> > > On the HMaster side there is no failure or error message in the log file.
>> > > On the Region Server side the error message below is reported.
>> > >
>> > > Region Server Log:
>> > >
>> > > 2014-07-26 19:29:15,904 DEBUG [regionserver60020-SendThread(host2:2181)] zookeeper.ClientCnxn: Reading reply sessionid:0x1476d8c83e5012c, packet:: clientPath:null serverPath:null finished:false header:: 172,4 replyHeader:: 172,4294988825,0 request:: '/hbase/table/hbase:acl,F response:: #ffffffff000146d61737465723a36303030303372ffffffeb39ffffffbbf15ffffffc15042554680,s{4294967476,4294967480,1406293600844,1406293601414,2,0,0,0,31,0,4294967476}
>> > > 2014-07-26 19:29:15,905 DEBUG [RS_LOG_REPLAY_OPS-host1:60020-0-Writer-0] wal.HLogSplitter: Writer thread Thread[RS_LOG_REPLAY_OPS-host1:60020-0-Writer-0,5,main]: starting
>> > > 2014-07-26 19:29:15,905 DEBUG [RS_LOG_REPLAY_OPS-host1:60020-0-Writer-1] wal.HLogSplitter: Writer thread Thread[RS_LOG_REPLAY_OPS-host1:60020-0-Writer-1,5,main]: starting
>> > > 2014-07-26 19:29:15,905 DEBUG [RS_LOG_REPLAY_OPS-host1:60020-0-Writer-2] wal.HLogSplitter: Writer thread Thread[RS_LOG_REPLAY_OPS-host1:60020-0-Writer-2,5,main]: starting
>> > > 2014-07-26 19:29:15,906 DEBUG [RS_LOG_REPLAY_OPS-host1:60020-0-Writer-3] wal.HLogSplitter: Writer thread Thread[RS_LOG_REPLAY_OPS-host1:60020-0-Writer-3,5,main]: starting
>> > > 2014-07-26 19:29:15,906 DEBUG [RS_LOG_REPLAY_OPS-host1:60020-0-Writer-4] wal.HLogSplitter: Writer thread Thread[RS_LOG_REPLAY_OPS-host1:60020-0-Writer-4,5,main]: starting
>> > > 2014-07-26 19:29:15,906 DEBUG [RS_LOG_REPLAY_OPS-host1:60020-0-Writer-5] wal.HLogSplitter: Writer thread Thread[RS_LOG_REPLAY_OPS-host1:60020-0-Writer-5,5,main]: starting
>> > > 2014-07-26 19:29:15,906 DEBUG [RS_LOG_REPLAY_OPS-host1:60020-0-Writer-6] wal.HLogSplitter: Writer thread Thread[RS_LOG_REPLAY_OPS-host1:60020-0-Writer-6,5,main]: starting
>> > > 2014-07-26 19:29:15,906 DEBUG [RS_LOG_REPLAY_OPS-host1:60020-0-Writer-7] wal.HLogSplitter: Writer thread Thread[RS_LOG_REPLAY_OPS-host1:60020-0-Writer-7,5,main]: starting
>> > > 2014-07-26 19:29:15,906 DEBUG [RS_LOG_REPLAY_OPS-host1:60020-0-Writer-8] wal.HLogSplitter: Writer thread Thread[RS_LOG_REPLAY_OPS-host1:60020-0-Writer-8,5,main]: starting
>> > > 2014-07-26 19:29:15,907 DEBUG [RS_LOG_REPLAY_OPS-host1:60020-0-Writer-9] wal.HLogSplitter: Writer thread Thread[RS_LOG_REPLAY_OPS-host1:60020-0-Writer-9,5,main]: starting
>> > > 2014-07-26 19:29:15,907 DEBUG [RS_LOG_REPLAY_OPS-host1:60020-0-Writer-10] wal.HLogSplitter: Writer thread Thread[RS_LOG_REPLAY_OPS-host1:60020-0-Writer-10,5,main]: starting
>> > > 2014-07-26 19:29:15,907 DEBUG [RS_LOG_REPLAY_OPS-host1:60020-0-Writer-11] wal.HLogSplitter: Writer thread Thread[RS_LOG_REPLAY_OPS-host1:60020-0-Writer-11,5,main]: starting
>> > > 2014-07-26 19:29:15,907 DEBUG [RS_LOG_REPLAY_OPS-host1:60020-0-Writer-12] wal.HLogSplitter: Writer thread Thread[RS_LOG_REPLAY_OPS-host1:60020-0-Writer-12,5,main]: starting
>> > > 2014-07-26 19:29:15,907 DEBUG [RS_LOG_REPLAY_OPS-host1:60020-0-Writer-13] wal.HLogSplitter: Writer thread Thread[RS_LOG_REPLAY_OPS-host1:60020-0-Writer-13,5,main]: starting
>> > > 2014-07-26 19:29:15,907 DEBUG [RS_LOG_REPLAY_OPS-host1:60020-0-Writer-14] wal.HLogSplitter: Writer thread Thread[RS_LOG_REPLAY_OPS-host1:60020-0-Writer-14,5,main]: starting
>> > > 2014-07-26 19:29:15,907 DEBUG [RS_LOG_REPLAY_OPS-host1:60020-0-Writer-15] wal.HLogSplitter: Writer thread Thread[RS_LOG_REPLAY_OPS-host1:60020-0-Writer-15,5,main]: starting
>> > > 2014-07-26 19:29:16,160 ERROR [RS_LOG_REPLAY_OPS-host1:60020-0] codec.BaseDecoder: Partial cell read caused by EOF: java.io.IOException: Premature EOF from inputStream
>> > > 2014-07-26 19:29:16,161 INFO [RS_LOG_REPLAY_OPS-host1:60020-0] wal.HLogSplitter: Finishing writing output logs and closing down.
>> > > 2014-07-26 19:29:16,161 INFO [RS_LOG_REPLAY_OPS-host1:60020-0] wal.HLogSplitter: Waiting for split writer threads to finish
>> > > 2014-07-26 19:29:16,161 INFO [RS_LOG_REPLAY_OPS-host1:60020-0] wal.HLogSplitter: Split writers finished
>> > > 2014-07-26 19:29:16,162 INFO [RS_LOG_REPLAY_OPS-host1:60020-0] wal.HLogSplitter: Processed 0 edits across 0 regions; log file=hdfs://hacluster/hbase/WALs/host1,60020,1406383007151-splitting/host1%2C60020%2C1406383007151.1406383069334.meta is corrupted = false progress failed = false
>> > > 2014-07-26 19:29:16,184 DEBUG [regionserver60020-SendThread(host2:2181)] zookeeper.ClientCnxn: Got notification sessionid:0x1476d8c83e5012c
>> > >
>> > > When I query the table, the data that was in the WAL files (before the
>> > > RegionServer machine went down) is not returned.
>> > >
>> > > One more thing I observed: even when the WAL file is not successfully
>> > > processed, it is still moved to the /oldWALs folder.
>> > >
>> > > So when I revert the 3 configuration settings below on the Region Server
>> > > side and restart, the WAL has already been moved to the oldWALs/ folder,
>> > > so it will not get processed.
>> > >
>> > > hbase.regionserver.hlog.reader.impl
>> > > org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogReader
>> > >
>> > > hbase.regionserver.hlog.writer.impl
>> > > org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogWriter
>> > >
>> > > hbase.regionserver.wal.encryption
>> > > true
>> > >
>> > > -------------------------------------------------------------------------------------------------------------
>> > >
>> > > And one more scenario I tried (Anoop suggested): with the configuration
>> > > below (instead of deleting the 3 config parameters, keep all of them but
>> > > set only 'hbase.regionserver.wal.encryption=false'), the encrypted WAL
>> > > file is processed successfully, and querying the table returns the WAL
>> > > data (from before the RegionServer machine went down) correctly.
>> > >
>> > > hbase.regionserver.hlog.reader.impl
>> > > org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogReader
>> > >
>> > > hbase.regionserver.hlog.writer.impl
>> > > org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogWriter
>> > >
>> > > hbase.regionserver.wal.encryption
>> > > false
>> > >
>> > > Regards
>> > > -Shankar
>> > >
>> > > This e-mail and its attachments contain confidential information from
>> > > HUAWEI, which is intended only for the person or entity whose address is
>> > > listed above. Any use of the information contained herein in any way
>> > > (including, but not limited to, total or partial disclosure, reproduction,
>> > > or dissemination) by persons other than the intended recipient(s) is
>> > > prohibited. If you receive this e-mail in error, please notify the sender
>> > > by phone or email immediately and delete it!
>> > >
>> > > -----Original Message-----
>> > > From: andrew.purtell@gmail.com [mailto:andrew.purtell@gmail.com] On Behalf Of Andrew Purtell
>> > > Sent: 26 July 2014 AM 02:21
>> > > To: user@hbase.apache.org
>> > > Subject: Re: HBase file encryption, inconsistencies observed and data loss
>> > >
>> > > Encryption (or the lack of it) doesn't explain missing HFiles.
>> > >
>> > > Most likely if you are having a problem with encryption, this will
>> > > manifest as follows: HFiles will be present. However, you will find many
>> > > IOExceptions in the regionserver logs as they attempt to open the HFiles
>> > > but fail because the data is unreadable.
>> > >
>> > > We should start by looking at more basic issues. What could explain the
>> > > total disappearance of HFiles?
>> > >
>> > > Is the HBase root filesystem on HDFS (fs URL starts with hdfs://) or on
>> > > the local filesystem (fs URL starts with file://)?
>> > >
>> > > In your email you provide only exceptions printed by the client. What kind
>> > > of exceptions appear in the regionserver logs? Or appear in the master log?
>> > >
>> > > If the logs are large your best bet is to pastebin them and then send the
>> > > URL to the paste in your response.
>> > >
>> > > On Fri, Jul 25, 2014 at 7:08 AM, Shankar hiremath <shankar.hiremath@huawei.com> wrote:
>> > >
>> > > HBase file encryption: some inconsistencies were observed, and data loss
>> > > happens after running the hbck tool. The operation steps are as below.
>> > > (One thing I observed: on startup of HMaster, if it is not able to
>> > > process the WAL file, the file is still moved to /oldWALs.)
>> > >
>> > > Procedure:
>> > >
>> > > 1. Start the HBase services (HMaster & Region Server)
>> > >
>> > > 2. Enable HFile encryption and WAL file encryption as below, and perform
>> > > 'table4-0' put operations (100 records added)
>> > >
>> > > hbase.crypto.keyprovider
>> > > org.apache.hadoop.hbase.io.crypto.KeyStoreKeyProvider
>> > >
>> > > hbase.crypto.keyprovider.parameters
>> > > jceks:///opt/shankar1/kdc_keytab/hbase.jks?password=Hadoop@234
>> > >
>> > > hbase.crypto.master.key.name
>> > > hdfs
>> > >
>> > > hfile.format.version
>> > > 3
>> > >
>> > > hbase.regionserver.hlog.reader.impl
>> > > org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogReader
>> > >
>> > > hbase.regionserver.hlog.writer.impl
>> > > org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogWriter
>> > >
>> > > hbase.regionserver.wal.encryption
>> > > true
>> > >
>> > > 3. Machine went down, so all processes went down
>> > >
>> > > 4. We disabled the WAL file encryption for performance reasons, and kept
>> > > encryption only for HFiles, as below
>> > >
>> > > hbase.crypto.keyprovider
>> > > org.apache.hadoop.hbase.io.crypto.KeyStoreKeyProvider
>> > >
>> > > hbase.crypto.keyprovider.parameters
>> > > jceks:///opt/shankar1/kdc_keytab/hbase.jks?password=Hadoop@234
>> > >
>> > > hbase.crypto.master.key.name
>> > > hdfs
>> > >
>> > > hfile.format.version
>> > > 3
>> > >
>> > > 5. Start the Region Server and query the 'table4-0' data
>> > >
>> > > hbase(main):003:0> count 'table4-0'
>> > > ERROR: org.apache.hadoop.hbase.NotServingRegionException: Region table4-0,,1406207815456.fc10620a3dcc14e004ab034420f7d332. is not online on XX-XX-XX-XX,60020,1406209023146
>> > > at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2685)
>> > > at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:4119)
>> > > at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3066)
>> > > at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29497)
>> > > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2084)
>> > > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
>> > > at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:168)
>> > > at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:39)
>> > > at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:111)
>> > > at java.lang.Thread.run(Thread.java:662)
>> > >
>> > > 6. Not able to read the data, so we decided to revert the configuration
>> > > (as original)
>> > >
>> > > 7. Kill/Stop the Region Server, revert all the configurations to the
>> > > original, as below
>> > >
>> > > hbase.crypto.keyprovider
>> > > org.apache.hadoop.hbase.io.crypto.KeyStoreKeyProvider
>> > >
>> > > hbase.crypto.keyprovider.parameters
>> > > jceks:///opt/shankar1/kdc_keytab/hbase.jks?password=Hadoop@234
>> > >
>> > > hbase.crypto.master.key.name
>> > > hdfs
>> > >
>> > > hfile.format.version
>> > > 3
>> > >
>> > > hbase.regionserver.hlog.reader.impl
>> > > org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogReader
>> > >
>> > > hbase.regionserver.hlog.writer.impl
>> > > org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogWriter
>> > >
>> > > hbase.regionserver.wal.encryption
>> > > true
>> > >
>> > > 7. Start the Region Server, and perform the 'table4-0' query
>> > >
>> > > hbase(main):003:0> count 'table4-0'
>> > > ERROR: org.apache.hadoop.hbase.NotServingRegionException: Region table4-0,,1406207815456.fc10620a3dcc14e004ab034420f7d332. is not online on XX-XX-XX-XX,60020,1406209023146
>> > > at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2685)
>> > > at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:4119)
>> > > at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3066)
>> > > at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29497)
>> > > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2084)
>> > > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
>> > > at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:168)
>> > > at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:39)
>> > > at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:111)
>> > > at java.lang.Thread.run(Thread.java:662)
>> > >
>> > > 8. Run the hbase hbck to repair, as below
>> > >
>> > > ./hbase hbck -details
>> > > .........................
>> > > Summary:
>> > > table1-0 is okay.
>> > > Number of regions: 0
>> > > Deployed on:
>> > > table2-0 is okay.
>> > > Number of regions: 0
>> > > Deployed on:
>> > > table3-0 is okay.
>> > > Number of regions: 0
>> > > Deployed on:
>> > > table4-0 is okay.
>> > > Number of regions: 0
>> > > Deployed on:
>> > > table5-0 is okay.
>> > > Number of regions: 0
>> > > Deployed on:
>> > > table6-0 is okay.
>> > > Number of regions: 0
>> > > Deployed on:
>> > > table7-0 is okay.
>> > > Number of regions: 0
>> > > Deployed on:
>> > > table8-0 is okay.
>> > > Number of regions: 0
>> > > Deployed on:
>> > > table9-0 is okay.
>> > > Number of regions: 0
>> > > Deployed on:
>> > > hbase:meta is okay.
>> > > Number of regions: 1
>> > > Deployed on: XX-XX-XX-XX,60020,1406209023146
>> > > hbase:acl is okay.
>> > > Number of regions: 0
>> > > Deployed on:
>> > > hbase:namespace is okay.
>> > > Number of regions: 0
>> > > Deployed on:
>> > > 22 inconsistencies detected.
>> > > Status: INCONSISTENT
>> > > 2014-07-24 19:13:05,532 INFO [main] client.HConnectionManager$HConnectionImplementation: Closing master protocol: MasterService
>> > > 2014-07-24 19:13:05,533 INFO [main] client.HConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x1475d1611611bcf
>> > > 2014-07-24 19:13:05,533 DEBUG [main] zookeeper.ZooKeeper: Closing session: 0x1475d1611611bcf
>> > > 2014-07-24 19:13:05,533 DEBUG [main] zookeeper.ClientCnxn: Closing client for session: 0x1475d1611611bcf
>> > > 2014-07-24 19:13:05,546 DEBUG [main-SendThread(XX-XX-XX-XX:2181)] zookeeper.ClientCnxn: Reading reply sessionid:0x1475d1611611bcf, packet:: clientPath:null serverPath:null finished:false header:: 6,-11 replyHeader:: 6,4295102074,0 request:: null response:: null
>> > > 2014-07-24 19:13:05,546 DEBUG [main] zookeeper.ClientCnxn: Disconnecting client for session: 0x1475d1611611bcf
>> > > 2014-07-24 19:13:05,546 DEBUG [main-SendThread(XX-XX-XX-XX:2181)] zookeeper.ClientCnxn: An exception was thrown while closing send thread for session 0x1475d1611611bcf : Unable to read additional data from server sessionid 0x1475d1611611bcf, likely server has closed socket
>> > > 2014-07-24 19:13:05,546 INFO [main-EventThread] zookeeper.ClientCnxn: EventThread shut down
>> > > 2014-07-24 19:13:05,546 INFO [main] zookeeper.ZooKeeper: Session: 0x1475d1611611bcf closed
>> > > shankar1@XX-XX-XX-XX:~/DataSight/hbase/bin>
>> > >
>> > > 9. Fix the assignments as below
>> > >
>> > > ./hbase hbck -fixAssignments
>> > > Summary:
>> > > table1-0 is okay.
>> > > Number of regions: 1
>> > > Deployed on: XX-XX-XX-XX,60020,1406209023146
>> > > table2-0 is okay.
>> > > Number of regions: 1
>> > > Deployed on: XX-XX-XX-XX,60020,1406209023146
>> > > table3-0 is okay.
>> > > Number of regions: 1
>> > > Deployed on: XX-XX-XX-XX,60020,1406209023146
>> > > table4-0 is okay.
>> > > Number of regions: 1
>> > > Deployed on: XX-XX-XX-XX,60020,1406209023146
>> > > table5-0 is okay.
>> > > Number of regions: 1
>> > > Deployed on: XX-XX-XX-XX,60020,1406209023146
>> > > table6-0 is okay.
>> > > Number of regions: 1
>> > > Deployed on: XX-XX-XX-XX,60020,1406209023146
>> > > table7-0 is okay.
>> > > Number of regions: 1
>> > > Deployed on: XX-XX-XX-XX,60020,1406209023146
>> > > table8-0 is okay.
>> > > Number of regions: 1
>> > > Deployed on: XX-XX-XX-XX,60020,1406209023146
>> > > table9-0 is okay.
>> > > Number of regions: 1
>> > > Deployed on: XX-XX-XX-XX,60020,1406209023146
>> > > hbase:meta is okay.
>> > > Number of regions: 1
>> > > Deployed on: XX-XX-XX-XX,60020,1406209023146
>> > > hbase:acl is okay.
>> > > Number of regions: 1
>> > > Deployed on: XX-XX-XX-XX,60020,1406209023146
>> > > hbase:namespace is okay.
>> > > Number of regions: 1
>> > > Deployed on: XX-XX-XX-XX,60020,1406209023146
>> > > 0 inconsistencies detected.
>> > > Status: OK
>> > > 2014-07-24 19:44:55,194 INFO [main] client.HConnectionManager$HConnectionImplementation: Closing master protocol: MasterService
>> > > 2014-07-24 19:44:55,194 INFO [main] client.HConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x2475d15f7b31b73
>> > > 2014-07-24 19:44:55,194 DEBUG [main] zookeeper.ZooKeeper: Closing session: 0x2475d15f7b31b73
>> > > 2014-07-24 19:44:55,194 DEBUG [main] zookeeper.ClientCnxn: Closing client for session: 0x2475d15f7b31b73
>> > > 2014-07-24 19:44:55,203 DEBUG [main-SendThread(XX-XX-XX-XX:2181)] zookeeper.ClientCnxn: Reading reply sessionid:0x2475d15f7b31b73, packet:: clientPath:null serverPath:null finished:false header:: 7,-11 replyHeader:: 7,4295102377,0 request:: null response:: null
>> > > 2014-07-24 19:44:55,203 DEBUG [main] zookeeper.ClientCnxn: Disconnecting client for session: 0x2475d15f7b31b73
>> > > 2014-07-24 19:44:55,204 DEBUG [main-SendThread(XX-XX-XX-XX:2181)] zookeeper.ClientCnxn: An exception was thrown while closing send thread for session 0x2475d15f7b31b73 : Unable to read additional data from server sessionid 0x2475d15f7b31b73, likely server has closed socket
>> > > 2014-07-24 19:44:55,204 INFO [main] zookeeper.ZooKeeper: Session: 0x2475d15f7b31b73 closed
>> > > 2014-07-24 19:44:55,204 INFO [main-EventThread] zookeeper.ClientCnxn: EventThread shut down
>> > >
>> > > 10. Fix the assignments as below
>> > >
>> > > ./hbase hbck -fixAssignments -fixMeta
>> > > Summary:
>> > > table1-0 is okay.
>> > > Number of regions: 1
>> > > Deployed on: XX-XX-XX-XX,60020,1406209023146
>> > > table2-0 is okay.
>> > > Number of regions: 1
>> > > Deployed on: XX-XX-XX-XX,60020,1406209023146
>> > > table3-0 is okay.
>> > > Number of regions: 1
>> > > Deployed on: XX-XX-XX-XX,60020,1406209023146
>> > > table4-0 is okay.
>> > > Number of regions: 1
>> > > Deployed on: XX-XX-XX-XX,60020,1406209023146
>> > > table5-0 is okay.
>> > > Number of regions: 1
>> > > Deployed on: XX-XX-XX-XX,60020,1406209023146
>> > > table6-0 is okay.
>> > > Number of regions: 1
>> > > Deployed on: XX-XX-XX-XX,60020,1406209023146
>> > > table7-0 is okay.
>> > > Number of regions: 1
>> > > Deployed on: XX-XX-XX-XX,60020,1406209023146
>> > > table8-0 is okay.
>> > > Number of regions: 1
>> > > Deployed on: XX-XX-XX-XX,60020,1406209023146
>> > > table9-0 is okay.
>> > > Number of regions: 1
>> > > Deployed on: XX-XX-XX-XX,60020,1406209023146
>> > > hbase:meta is okay.
>> > > Number of regions: 1
>> > > Deployed on: XX-XX-XX-XX,60020,1406209023146
>> > > hbase:acl is okay.
>> > > Number of regions: 1
>> > > Deployed on: XX-XX-XX-XX,60020,1406209023146
>> > > hbase:namespace is okay.
>> > > Number of regions: 1
>> > > Deployed on: XX-XX-XX-XX,60020,1406209023146
>> > > 0 inconsistencies detected.
>> > > Status: OK
>> > > 2014-07-24 19:46:16,290 INFO [main] client.HConnectionManager$HConnectionImplementation: Closing master protocol: MasterService
>> > > 2014-07-24 19:46:16,290 INFO [main] client.HConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x3475d1605321be9
>> > > 2014-07-24 19:46:16,290 DEBUG [main] zookeeper.ZooKeeper: Closing session: 0x3475d1605321be9
>> > > 2014-07-24 19:46:16,290 DEBUG [main] zookeeper.ClientCnxn: Closing client for session: 0x3475d1605321be9
>> > > 2014-07-24 19:46:16,300 DEBUG [main-SendThread(XX-XX-XX-XX:2181)] zookeeper.ClientCnxn: Reading reply sessionid:0x3475d1605321be9, packet:: clientPath:null serverPath:null finished:false header:: 6,-11 replyHeader:: 6,4295102397,0 request:: null response:: null
>> > > 2014-07-24 19:46:16,300 DEBUG [main] zookeeper.ClientCnxn: Disconnecting client for session: 0x3475d1605321be9
>> > > 2014-07-24 19:46:16,300 DEBUG [main-SendThread(XX-XX-XX-XX:2181)] zookeeper.ClientCnxn: An exception was thrown while closing send thread for session 0x3475d1605321be9 : Unable to read additional data from server sessionid 0x3475d1605321be9, likely server has closed socket
>> > > 2014-07-24 19:46:16,300 INFO [main] zookeeper.ZooKeeper: Session: 0x3475d1605321be9 closed
>> > > 2014-07-24 19:46:16,300 INFO [main-EventThread] zookeeper.ClientCnxn: EventThread shut down
>> > >
>> > > hbase(main):006:0> count 'table4-0'
>> > > 0 row(s) in 0.0200 seconds
>> > > => 0
>> > > hbase(main):007:0>
>> > >
>> > > Complete data loss happened:
>> > > WALs, oldWALs & /hbase/data/default/table4-0/ do not have any data.
>> > >
>> > > --
>> > > Best regards,
>> > >
>> > > - Andy
>> > >
>> > > Problems worthy of attack prove their worth by hitting back. - Piet Hein
>> > > (via Tom White)

--001a11c33aa8ad555704ff2b4fd8--
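For reference, the recovery settings described at the top of this thread (keep the secure WAL reader and writer, disable only WAL encryption) might look roughly like this in hbase-site.xml. The property names and values are copied from the messages above; the <configuration>/<property>/<name>/<value> wrappers are the usual Hadoop configuration layout and are an assumption here, since the archive dropped the original markup.

```xml
<!-- Sketch only: wrapper tags are assumed; names and values are taken from the thread. -->
<configuration>
  <!-- Keep the secure reader so WALs written while encryption was on can still be replayed. -->
  <property>
    <name>hbase.regionserver.hlog.reader.impl</name>
    <value>org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogReader</value>
  </property>
  <property>
    <name>hbase.regionserver.hlog.writer.impl</name>
    <value>org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogWriter</value>
  </property>
  <!-- Only this switch changes when WAL encryption is turned off. -->
  <property>
    <name>hbase.regionserver.wal.encryption</name>
    <value>false</value>
  </property>
</configuration>
```

Per Anoop's note, SecureProtobufLogReader reads both encrypted and unencrypted files, which is why leaving it configured is what allowed the existing encrypted WALs to be processed in Shankar's second scenario.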