hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-7704) migration tool that checks presence of HFile V1 files
Date Wed, 03 Apr 2013 18:47:16 GMT

    [ https://issues.apache.org/jira/browse/HBASE-7704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13621166#comment-13621166 ]

stack commented on HBASE-7704:

Here is more on how I think this would work (if anyone is listening).

We want a script that goes through all the hfiles in an hbase install and returns exit
code 0 if no v1 files are found and non-zero if any are (it should fail out as soon as it
finds a v1 file).

This script can be run while an hbase cluster is up.

Script should be called something like v1Free or noV1HFiles.

Script can be in java or jruby since it is a one-off, needed only to check an hbase install
BEFORE we do an upgrade.

Script should implement http://hadoop.apache.org/docs/r2.0.3-alpha/api/org/apache/hadoop/util/Tool.html
so it can pick up configuration to find the hbase to run against (I suppose this means it is
written in java).

Ideally, the script should be written such that if we need to use it in a mapreduce job,
it'd be easy to do (we do not need the mapreduce job as part of this JIRA I would say).  I
think this means that the script has a method which takes a fully qualified hfile Path
and returns true/false.
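
A minimal sketch of such a per-file method, for illustration only. It assumes the
hfile trailer ends with a 4-byte big-endian version int whose low three bytes hold
the major version (check this against the hfile trailer code before relying on it).
It reads a local file via java.nio as a stand-in; a real tool would open the Path
through the Hadoop FileSystem. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Path;

final class HFileVersionSketch {
    /**
     * Reads the last 4 bytes of the file as a big-endian int and returns
     * its low three bytes. ASSUMPTION: this is where the hfile trailer
     * keeps its major version; verify against the real trailer layout.
     */
    static int majorVersion(Path hfile) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(hfile.toFile(), "r")) {
            long len = raf.length();
            if (len < 4) {
                throw new IOException("File too short to hold a version: " + hfile);
            }
            raf.seek(len - 4);
            int serialized = raf.readInt(); // readInt is big-endian
            return serialized & 0x00ffffff; // drop the top (minor version) byte
        }
    }

    /** The true/false check: true if the file reports major version 1. */
    static boolean isV1(Path hfile) throws IOException {
        return majorVersion(hfile) == 1;
    }
}
```

Keeping the check as a static method on a single path is what would make it easy
to call from a mapper later, one input record per hfile.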

Would suggest that the script have an executor service and take on the command line the number
of concurrent threads to run, w/ a reasonable default, so that the checking is done in parallel.
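
One way the parallel check with fail-fast exit codes might look, as a sketch. The
default of 4 threads, the exit code 2 for a failed check, and all names here are
assumptions for illustration; the per-file check is passed in as a predicate so
the sketch does not depend on how the version is actually read.

```java
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Predicate;

final class ParallelCheckSketch {
    static final int DEFAULT_THREADS = 4; // assumed "reasonable default"

    /** Thread count from the first command-line argument, else the default. */
    static int parseThreads(String[] args) {
        return args.length > 0 ? Math.max(1, Integer.parseInt(args[0])) : DEFAULT_THREADS;
    }

    /**
     * Runs isV1 over all files on a fixed-size pool.
     * Returns 0 if no file is v1, 1 as soon as any completed check reports
     * v1, and 2 (an assumed convention) if a check itself threw.
     */
    static <T> int check(List<T> files, Predicate<T> isV1, int threads)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        ExecutorCompletionService<Boolean> done = new ExecutorCompletionService<>(pool);
        try {
            for (T f : files) {
                done.submit(() -> isV1.test(f));
            }
            // Consume results in completion order so we can stop early.
            for (int i = 0; i < files.size(); i++) {
                if (done.take().get()) {
                    return 1; // fail out on the first v1 file seen
                }
            }
            return 0;
        } catch (ExecutionException e) {
            return 2;
        } finally {
            pool.shutdownNow(); // interrupts any checks still running
        }
    }
}
```

A main method would then end with something like
System.exit(check(files, isV1, parseThreads(args))).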

So, the script would walk the hbase.rootdir looking for hfiles.  Look at the hbase file utils
because it will need to skip over special files and directories.  For each file found, it would
read in the file's metadata and check for v1.  See hfile for how it finds the metadata at the
end of a file.
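
The walk could be sketched roughly as below. This uses a local directory tree via
java.nio as a stand-in for the Hadoop FileSystem, and it assumes that "special"
means any file or directory whose name starts with '.' or '_'; the real rules
should come from the hbase file utils. Names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

final class RootDirWalkSketch {
    /** ASSUMPTION: names starting with '.' or '_' are special and skipped. */
    static boolean isSpecial(Path p) {
        Path name = p.getFileName();
        if (name == null) {
            return false;
        }
        String n = name.toString();
        return n.startsWith(".") || n.startsWith("_");
    }

    /** Collects every regular, non-special file under rootDir. */
    static List<Path> listCandidates(Path rootDir) throws IOException {
        List<Path> out = new ArrayList<>();
        Files.walkFileTree(rootDir, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) {
                // Never skip the root itself; prune special subdirectories whole.
                if (!dir.equals(rootDir) && isSpecial(dir)) {
                    return FileVisitResult.SKIP_SUBTREE;
                }
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                if (attrs.isRegularFile() && !isSpecial(file)) {
                    out.add(file);
                }
                return FileVisitResult.CONTINUE;
            }
        });
        return out;
    }
}
```

Each path returned would then be handed to the per-file version check.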

> migration tool that checks presence of HFile V1 files
> -----------------------------------------------------
>                 Key: HBASE-7704
>                 URL: https://issues.apache.org/jira/browse/HBASE-7704
>             Project: HBase
>          Issue Type: Task
>            Reporter: Ted Yu
>            Priority: Blocker
>             Fix For: 0.95.1
>
> Below was Stack's comment from HBASE-7660:
> Regards the migration 'tool', or 'tool' to check for presence of v1 files, I imagine
> it as an addition to the hfile tool http://hbase.apache.org/book.html#hfile_tool2 The hfile
> tool already takes a bunch of args including printing out meta. We could add an option to
> print out version only – or return 1 if version 1 or some such – and then do a bit of
> code to just list all hfiles and run this script against each. Could MR it if too many files.

