hbase-issues mailing list archives

From "Himanshu Vashishtha (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-7704) migration tool that checks presence of HFile V1 files
Date Thu, 04 Apr 2013 18:58:15 GMT

    [ https://issues.apache.org/jira/browse/HBASE-7704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13622660#comment-13622660 ]

Himanshu Vashishtha commented on HBASE-7704:

Thanks for the reviews guys:
bq. Would suggest you add to the class comment how to run it.
Will do.
bq. Does "./bin/hbase org.apache.hadoop.hbase.util.HFileV1Detector --h" work?
bq. Suggest you show us what '-h' output looks like here in this issue:
./bin/hbase org.apache.hadoop.hbase.util.HFileV1Detector -help
usage: HFileV1Detector [-h] [-n <arg>] [-p <arg>]
 -h,--help                    Help
 -n,--numberOfThreads <arg>   Number of threads
 -p,--path <arg>              Path to table
If no option is provided, it processes hbase.rootdir with 10 threads.
The help section is printed with -h, --h, -help, and --help.

bq. Suggest it return -1 if hfilev1 files found: + return 0;
The goal of this tool is to tell the user which regions have hfilev1 files in them. It prints
out those regions (currently, it prints the full path so that the table name is visible).

bq. What will you ignore?
Yes, .logs, etc. Basically, all the non-table directories.

bq. You keep saying you are going to print regions that have v1 files but you seem to be printing
out full path
Yes, it's intentional, to let the user know which table the region belongs to. IMO, it is
good to know.

bq. Suggest an option that will fast fail... fail as soon as it finds the first v1. This is
probably not important so if it takes a while, just punt.
bq. It looks like you do fail fast – you stop scanning a family as soon as you find a v1
Yea, I designed it with the following behavior in mind:
1) Scan one table at a time. This way we can give a table a clean chit if no hfilev1 is found
inside it.
2) Scan regions in parallel. This is where the executor comes in. Basically, scanning a region
is a task. If a v1 hfile is found in any of the CFs, there is no need to scan the other
families, as we would like the user to compact that region anyway.
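
The two points above can be sketched roughly as follows. This is only an illustration of the scanning strategy, not the code in the attached patch: the class V1ScanSketch, the Region holder, and the in-memory map of family name to hfile major versions are all hypothetical stand-ins for the real file system walk.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class V1ScanSketch {

    /** Hypothetical stand-in for a region: its name and, per family, the hfile major versions. */
    static final class Region {
        final String name;
        final Map<String, List<Integer>> families;

        Region(String name, Map<String, List<Integer>> families) {
            this.name = name;
            this.families = families;
        }
    }

    /** Returns true at the first v1 file; remaining files and families are not looked at. */
    static boolean regionHasV1(Region region) {
        for (List<Integer> versions : region.families.values()) {
            for (int version : versions) {
                if (version == 1) {
                    return true;
                }
            }
        }
        return false;
    }

    /** Scans the regions of one table in parallel: each region is one task. */
    static List<String> scanTable(List<Region> regions, ExecutorService pool) throws Exception {
        List<Future<Boolean>> results = new ArrayList<>();
        for (final Region region : regions) {
            Callable<Boolean> task = () -> regionHasV1(region);
            results.add(pool.submit(task));
        }
        List<String> regionsToCompact = new ArrayList<>();
        for (int i = 0; i < regions.size(); i++) {
            if (results.get(i).get()) {
                regionsToCompact.add(regions.get(i).name);
            }
        }
        return regionsToCompact;
    }

    /** Walks the tables one at a time, so each table gets its own verdict. */
    static Map<String, List<String>> scanAll(Map<String, List<Region>> tables, int numThreads)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        try {
            Map<String, List<String>> v1RegionsByTable = new LinkedHashMap<>();
            for (Map.Entry<String, List<Region>> table : tables.entrySet()) {
                v1RegionsByTable.put(table.getKey(), scanTable(table.getValue(), pool));
            }
            return v1RegionsByTable;
        } finally {
            pool.shutdown();
        }
    }

    /** Made-up sample data loosely mirroring the output pasted in this comment. */
    static Map<String, List<Region>> sampleTables() {
        Map<String, List<Region>> tables = new LinkedHashMap<>();
        tables.put("-ROOT-", Collections.singletonList(
            new Region("root-region", Collections.singletonMap("info", Arrays.asList(2)))));
        tables.put("t", Collections.singletonList(
            new Region("c6b79b9f1ca4a37921355ddbfb521761",
                Collections.singletonMap("f", Arrays.asList(2, 1)))));
        return tables;
    }

    public static void main(String[] args) throws Exception {
        // 10 threads matches the default mentioned in the help text above.
        System.out.println(scanAll(sampleTables(), 10));
    }
}
```

An empty list for a table corresponds to the "has no HFileV1" line; a non-empty list holds the regions the user would be asked to major compact.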

bq. Why not include original exception here: + throw new IOException("Unknown version for
hfile: " + storeFilePath);?
Will do what Sergey said.

The current output is:
Table hdfs://localhost:41020/hbase-0.94/-ROOT- has no HFileV1.
Found a v1 hfile, hdfs://localhost:41020/hbase-0.94/t/c6b79b9f1ca4a37921355ddbfb521761/f/2811264815153459761
Region has a hfile v1: hdfs://localhost:41020/hbase-0.94/t/c6b79b9f1ca4a37921355ddbfb521761
Table hdfs://localhost:41020/hbase-0.94/t has 1 number of HFileV1.

==================Regions to Major Compact==============


===========End of Regions to Major Compact==============

Total  number of HFile V1 is: 1

I will add a section that prints out all the v1 hfiles, remove the extra messaging, fix the
nits suggested by Sergey, and paste the new output.

> migration tool that checks presence of HFile V1 files
> -----------------------------------------------------
>                 Key: HBASE-7704
>                 URL: https://issues.apache.org/jira/browse/HBASE-7704
>             Project: HBase
>          Issue Type: Task
>            Reporter: Ted Yu
>            Assignee: Himanshu Vashishtha
>            Priority: Blocker
>             Fix For: 0.95.1
>         Attachments: HBase-7704-v1.patch
> Below was Stack's comment from HBASE-7660:
> Regards the migration 'tool', or 'tool' to check for presence of v1 files, I imagine
> it as an addition to the hfile tool http://hbase.apache.org/book.html#hfile_tool2 The hfile
> tool already takes a bunch of args including printing out meta. We could add an option to
> print out version only – or return 1 if version 1 or some such – and then do a bit of
> code to just list all hfiles and run this script against each. Could MR it if too many files.
