hbase-dev mailing list archives

From "Chunhui Shen" <zju...@163.com>
Subject Re:Re: [RESULT] ANN: The third hbase-0.96.1 release candidate is available for download
Date Wed, 18 Dec 2013 01:53:43 GMT
About the online merge:


HBCK currently reports an error after an online merge,
because the files of the merged regions remain on HDFS until CatalogJanitor cleans them up
later.


During the merge, we create file references instead of moving the files, because moving
them would break Table Snapshot.
Thus, we cannot remove these files until the merged region completes compaction.


Thanks for the feedback.


I will enhance HBCK to handle this case.
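
Until then, a scheduled hbck job could check for this transient state before repairing anything. Below is a minimal sketch in Python, assuming the error text has exactly the shape JM quoted below. The names orphan_regions, classify and merged_parents are hypothetical helpers for illustration and are not part of HBase or HBCK:

```python
import re

# Assumption: hbck prints orphan-region errors in the shape quoted in this thread.
ORPHAN_RE = re.compile(
    r"ERROR: Region \{ meta => null, hdfs => (\S+?),\s*deployed =>\s*\} "
    r"on HDFS, but not listed in hbase:meta"
)


def orphan_regions(hbck_output):
    """Return encoded names of regions that are on HDFS but not in hbase:meta."""
    # Collapse whitespace so an error wrapped over several lines still matches.
    flat = " ".join(hbck_output.split())
    # The encoded region name is the last component of the region directory path.
    return [path.rstrip("/").rsplit("/", 1)[-1] for path in ORPHAN_RE.findall(flat)]


def classify(hbck_output, merged_parents):
    """Split orphan regions into (expected, unexpected).

    merged_parents is a hypothetical set of encoded region names that the
    operator recorded when issuing merge_region; nothing here queries HBase.
    """
    orphans = orphan_regions(hbck_output)
    expected = [r for r in orphans if r in merged_parents]
    unexpected = [r for r in orphans if r not in merged_parents]
    return expected, unexpected
```

If every orphan falls in the expected list, the job can simply wait and rerun hbck after CatalogJanitor has had time to clean up, rather than running a repair.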

At 2013-12-18 03:21:42,"Jean-Marc Spaggiari" <jean-marc@spaggiari.org> wrote:
>So. Some feedback.
>
>0.94.x gives "Status: OK" in HBCK.
>
>Did a distcp between the 2 clusters, removed splitlog since I'm not able to
>change the owner to my HBase user, did the upgrade, started.
>
>I can see all my tables correctly, able to scan them.
>
>HBCK reports all the tables as okay, even the hbase:meta table, however,
>I'm getting this:
>"ERROR: Empty REGIONINFO_QUALIFIER found in hbase:meta"
>
>Ran hbck with -fixEmptyMetaCells
>Reran it. All clear now.
>
>Now, I played with the online merge, and I'm still getting errors, but they
>seem to just be bad timing.
>
>tl;dr: jump to the arrow below.
>
>There are initially 4 regions in the table. I merge the first 2
>together. That creates a 3-region table. I again merge the first 2
>together. I wait a few minutes, and I run HBCK.
>
>ERROR: Region { meta => null, hdfs =>
>hdfs://hbasetest1:9000/hbase/data/default/dns/c6569a72cc3c2750d14976ab85f02315,
>deployed =>  } on HDFS, but not listed in hbase:meta or deployed on any
>region server
>ERROR: Region { meta => null, hdfs =>
>hdfs://hbasetest1:9000/hbase/data/default/dns/efa630782e1d603fbc239a11ab292957,
>deployed =>  } on HDFS, but not listed in hbase:meta or deployed on any
>region server
>
>I merged those 4 regions:
>merge_region 'bb65f685cdefc4f2491d246f376fc1f0',
>'d02ce8e3fa1a200c7f034b349acf8cc8'
>merge_region 'efa630782e1d603fbc239a11ab292957',
>'c6569a72cc3c2750d14976ab85f02315'
>
>And here is the HDFS content after the merge:
>drwxr-xr-x   - hbase hbase          0 2013-12-17 13:35
>/hbase/data/default/dns/c6569a72cc3c2750d14976ab85f02315
>drwxr-xr-x   - hbase hbase          0 2013-12-17 13:35
>/hbase/data/default/dns/d5b74aaa2853b00b0ad0f20f60c74398
>drwxr-xr-x   - hbase hbase          0 2013-12-17 13:46
>/hbase/data/default/dns/efa630782e1d603fbc239a11ab292957
>drwxr-xr-x   - hbase hbase          0 2013-12-17 13:46
>/hbase/data/default/dns/f2e0764d4e9dea8bfc0aeed9da3da5f7
>
>And the table in the WebUI:
>dns,,1387305985379.f2e0764d4e9dea8bfc0aeed9da3da5f7.
>dns,theafronews.ca,1379202071281.d5b74aaa2853b00b0ad0f20f60c74398.
>
>Regions efa630782e1d603fbc239a11ab292957 and
>c6569a72cc3c2750d14976ab85f02315 should not be there anymore.
>
>Waiting even longer, they are now removed and hbck reports everything is
>correct.
>
>I know some people run hbck -repair as a cron job.
>If that occurs just after the regions got merged, it might re-create the
>entries in the meta based on the HDFS content, and they will have overlaps
>and duplicates.
>
>===> So to summarize, it seems the merge happens pretty quickly, but it waits
>for the CatalogJanitor to remove the directories left over by the process.
>I think the merge process should remove those files and not rely on the
>catalog janitor. I ran the test multiple times. The first time, it took about 30
>seconds for the janitor to clear the paths. But the second time, it took 4
>minutes for the janitor to run and clear the files...
>
>One last small thing. There is no longer a split button in the WebUI. When
>you don't want to split on a specific key, it's not obvious that you
>have to click into the empty field and press enter.
>
>JM
>
>2013/12/17 Stack <stack@duboce.net>
>
>> On Tue, Dec 17, 2013 at 7:38 AM, Jean-Marc Spaggiari <
>> jean-marc@spaggiari.org> wrote:
>>
>> > Sorry about that mates, I know I'm late. I was fighting against snappy
>> > codec for the last few days and was not able to correctly start up my
>> > 0.96.1 version.
>> >
>> > So since it's already over, I have done a reduce phase test.
>> > Verified the signature, checked the documentation and the CHANGES.txt
>> > file.
>> > distcp 2TB from a 0.94.x/hadoop 1.0.3 cluster to 0.96.1/hadoop 2.2.0. Ran
>> > the migration tool.
>> > Online merged an entire table to a single region.
>> >
>> >
>> Thank you JMS.
>>
>>
>>
>> > At the end of all of that I have some inconsistencies in the system
>> > reported by HBCK. (Extra regions, empty regioninfo_qualifier in the meta,
>> > etc.).
>> >
>> >
>> > I will redo all the steps I did one by one and run HBCK between each to
>> > see where it failed and report what I found. Next step will be to enable
>> > replication between my 0.94 and my 0.96 clusters.
>> >
>>
>> That'd be really helpful.  Thanks.
>>
>> St.Ack
>>