hbase-dev mailing list archives

From stack <st...@duboce.net>
Subject New file format migration
Date Fri, 13 Feb 2009 23:13:09 GMT
Moving to the new file format (see hbase-61), I used to think that we could
run the regionserver with readers and writers for both the old and new
formats and that, as we went, we'd rewrite old-format files into the new
format on compaction.  To facilitate this lazy migration, we would spend
time extracting a lowest-common-denominator interface and then write
implementations and glue code for mapfile and the new hfile.

Now, starting the hfile integration effort, I see that lazy migration would
force us to give up some of the performance benefits hfile brings.  Whole
sections of the Store and StoreFile code are synchronized so that a mapfile
can iterate undisturbed, without interference from any other concurrent
access.  This blocking is not necessary with hfile (internally it maintains
thread-safe scanning).  Writing two code paths, one for the old format and
one for the new, would complicate the server considerably, delay the
release, etc.
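
To make the locking difference concrete, here is a rough sketch.  The
class names (SharedCursorReader, PerScannerReader, ScanDemo) are made up
for illustration and are not the actual Store, StoreFile, mapfile or hfile
code.  The first reader keeps a single cursor, so the whole scan has to be
synchronized; the second hands every caller its own cursor over immutable
data, so scans can run in parallel without a lock.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a reader with one shared position.
final class SharedCursorReader {
    private final List<String> rows;
    private int pos; // single cursor shared by every caller

    SharedCursorReader(List<String> rows) {
        this.rows = new ArrayList<>(rows);
    }

    // The whole scan is one critical section, so concurrent scans serialize.
    synchronized List<String> scanAll() {
        pos = 0;
        List<String> out = new ArrayList<>();
        while (pos < rows.size()) {
            out.add(rows.get(pos++));
        }
        return out;
    }
}

// Hypothetical stand-in for a reader whose scanners carry their own state.
final class PerScannerReader {
    private final List<String> rows; // never modified after construction

    PerScannerReader(List<String> rows) {
        this.rows = Collections.unmodifiableList(new ArrayList<>(rows));
    }

    // Each call returns an independent cursor; no lock is required.
    Cursor cursor() {
        return new Cursor();
    }

    final class Cursor {
        private int pos; // private to this cursor
        boolean hasNext() { return pos < rows.size(); }
        String next() { return rows.get(pos++); }
    }
}

final class ScanDemo {
    // Runs one full scan per thread and returns the total rows read.
    static int scanConcurrently(PerScannerReader reader, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Callable<Integer> task = () -> {
                int n = 0;
                PerScannerReader.Cursor c = reader.cursor();
                while (c.hasNext()) { c.next(); n++; }
                return n;
            };
            List<Future<Integer>> results = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                results.add(pool.submit(task));
            }
            int total = 0;
            for (Future<Integer> f : results) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> rows = new ArrayList<>();
        Collections.addAll(rows, "a", "b", "c");
        System.out.println(new SharedCursorReader(rows).scanAll().size());
        System.out.println(scanConcurrently(new PerScannerReader(rows), 4));
    }
}
```

If both formats had to be served through one code path, that path would
need the locking of the first style even when reading hfiles.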

I am now leaning toward a fat migration that major-compacts old stores and,
as it runs, writes out the new files as hfiles.  We'd do this either as a
distinct mapreduce job or as part of regionserver startup -- basically, on
open, migrate the individual regions.
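
A toy model of the migrate-on-open idea, purely as a sketch.  FileFormat,
StoreFileModel and StoreModel are hypothetical in-memory types, not HBase
classes, and the merge keeps only the newest value per key; a real major
compaction would also have to handle versions, timestamps and deletes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical marker for which on-disk format a file uses.
enum FileFormat { MAPFILE, HFILE }

// Hypothetical store file: a format tag plus sorted key/value cells.
final class StoreFileModel {
    final FileFormat format;
    final TreeMap<String, String> cells;

    StoreFileModel(FileFormat format, Map<String, String> cells) {
        this.format = format;
        this.cells = new TreeMap<>(cells);
    }
}

// Hypothetical store: a list of files ordered oldest first.
final class StoreModel {
    final List<StoreFileModel> files = new ArrayList<>();

    // Called when the region is opened. Returns true if anything was migrated.
    boolean migrateOnOpen() {
        boolean hasOldFormat = false;
        for (StoreFileModel f : files) {
            if (f.format == FileFormat.MAPFILE) {
                hasOldFormat = true;
            }
        }
        if (!hasOldFormat) {
            return false; // already all hfiles, nothing to do
        }
        // "Major compaction": merge every file into one, with newer files
        // overwriting older ones, and write the result in the new format.
        TreeMap<String, String> merged = new TreeMap<>();
        for (StoreFileModel f : files) {
            merged.putAll(f.cells);
        }
        files.clear();
        files.add(new StoreFileModel(FileFormat.HFILE, merged));
        return true;
    }
}

final class MigrationDemo {
    public static void main(String[] args) {
        StoreModel store = new StoreModel();
        store.files.add(new StoreFileModel(FileFormat.MAPFILE,
                Collections.singletonMap("row1", "old")));
        store.files.add(new StoreFileModel(FileFormat.MAPFILE,
                Collections.singletonMap("row1", "new")));
        System.out.println("migrated: " + store.migrateOnOpen());
        System.out.println("files after open: " + store.files.size());
    }
}
```

After the open completes, the store only ever sees hfiles, so the serving
path needs just the one reader implementation.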