hbase-dev mailing list archives

From James Kennedy <james.kenn...@troove.net>
Subject Data upgrade from 0.89x to 0.90.0.
Date Fri, 11 Feb 2011 06:42:32 GMT
I've tested HBase 0.90 + HBase-trx 0.90.0 and I've run it over old data from 0.89x using a
variety of seeded unit test/QA data and cluster configurations.

But when it came time to upgrade some production data I got snagged on HBASE-3524. The gist
of it is in Ryan's last points:

* compaction is "optional", meaning if it fails no data is lost, so you
should probably be fine.

* Older versions of the code did not write out time tracker data and
that is why your older files were giving you NPEs.

Makes sense. But why did I not encounter this during my initial data upgrades on very similar
data packages?

So I applied Ryan's patch, which simply assigns a default value (Long.MIN_VALUE) when a StoreFile
lacks a timeRangeTracker, and I "fixed" the data by forcing major compactions on the affected
regions. Preliminary poking has not shown any instability in the data since.
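
For anyone following along, here is roughly the kind of null guard I mean. This is only a minimal sketch and not the actual patch: TimeRangeSketch, Tracker and maxTimestampOrDefault are hypothetical names made up for illustration, not HBase classes or methods.

```java
// Hypothetical stand-ins for illustration only; these are NOT HBase's classes.
final class TimeRangeSketch {

    /** Minimal tracker that records the largest timestamp seen in a file. */
    static final class Tracker {
        final long maxTimestamp;

        Tracker(long maxTimestamp) {
            this.maxTimestamp = maxTimestamp;
        }
    }

    /**
     * Returns the tracker's max timestamp, or Long.MIN_VALUE when the file
     * was written without tracker metadata (tracker == null), instead of
     * dereferencing null and throwing an NPE.
     */
    static long maxTimestampOrDefault(Tracker tracker) {
        return tracker == null ? Long.MIN_VALUE : tracker.maxTimestamp;
    }

    public static void main(String[] args) {
        // A file written with tracker data reports its real max timestamp.
        System.out.println(maxTimestampOrDefault(new Tracker(1297406552000L)));
        // An older file without tracker data falls back to the default.
        System.out.println(maxTimestampOrDefault(null));
    }
}
```

The point is just that a file without tracker metadata yields a sentinel value rather than an exception; whether Long.MIN_VALUE is a safe sentinel everywhere the value gets used is exactly what I'm asking about below.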

But I confess that I just don't have the time right now to really dig into the code and validate
that there are no more gotchas or data corruption that could have resulted.

I guess the questions that I have for the team are:

* What state would 9 out of 50 tables be in to miss the new 0.90.0 timeRangeTracker injection
before the first major compaction check?
* Where else is the new TimeRangeTracker used?  Could a StoreFile with a null timeRangeTracker
have corrupted the data in other subtler ways?
* What other upgrade-related data changes might not have completed elsewhere?


James Kennedy
Project Manager
Troove Inc.
