hbase-issues mailing list archives

From "Jesse Yates (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-5509) MR based copier for copying HFiles (trunk version)
Date Sat, 03 Mar 2012 03:39:59 GMT

    [ https://issues.apache.org/jira/browse/HBASE-5509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221499#comment-13221499 ]

Jesse Yates commented on HBASE-5509:
------------------------------------

bq. The command options need to be documented better. In fact the argument parsing should
be improved too.

+1

bq. Generally: Is o.a.h.h.backups the right place to put this? Do we want this in core HBase?

IMO, it should definitely be part of core. In the most common DBs, backup/snapshot is part
of the database rather than a separate tool that you get from somewhere else. We can always
break the paradigm, but it seems to fit in this case.

bq. 1. do we want to go this route at all

I think this approach is pretty reasonable. To get 'real' snapshotting, we will obviously
have to do a bit more work, but this is the right approach to get there. Ideally, I should
just be able to hook up the region files to another cluster and recover/roll back to the
previous state. That seems the safest and fastest option, though it is debatable how much
safer and faster it is, and whether it's worth the work at the moment.

                
> MR based copier for copying HFiles (trunk version)
> --------------------------------------------------
>
>                 Key: HBASE-5509
>                 URL: https://issues.apache.org/jira/browse/HBASE-5509
>             Project: HBase
>          Issue Type: Sub-task
>          Components: documentation, regionserver
>            Reporter: Karthik Ranganathan
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0, 0.96.0
>
>         Attachments: 5509.txt
>
>
> This copier is a modification of the distcp tool in HDFS. It does the following:
> 1. List out all the regions in the HBase cluster for the required table
> 2. Write the above out to a file
> 3. Each mapper 
>    3.1 lists all the HFiles for a given region by querying the regionserver
>    3.2 copies all the HFiles
>    3.3 outputs success if the copy succeeded, failure otherwise. Failed regions are retried in another loop
> 4. Mappers are placed on nodes which have maximum locality for a given region to speed up copying
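
For readers skimming the issue description, steps 1-3 above can be sketched roughly as follows. This is only an illustration on a local filesystem, not the code in the attached patch: the function names (list_regions, copy_region, copy_table), the regions.txt manifest name, the max_passes limit, and the flat "one directory per region" layout are all assumptions made for the example. It does not use MapReduce or any HBase API, and step 4 (locality-aware mapper placement) is not modeled.

```python
import shutil
from pathlib import Path


def list_regions(table_dir: Path) -> list[str]:
    """Step 1: list the regions of a table (here: its subdirectories)."""
    return sorted(p.name for p in table_dir.iterdir() if p.is_dir())


def copy_region(table_dir: Path, dest_dir: Path, region: str) -> bool:
    """Step 3: list a region's files, copy them all, report success/failure."""
    try:
        target = dest_dir / region
        target.mkdir(parents=True, exist_ok=True)
        for hfile in sorted((table_dir / region).iterdir()):
            shutil.copy2(hfile, target / hfile.name)
        return True
    except OSError:
        # Any I/O problem marks the whole region as failed.
        return False


def copy_table(table_dir: Path, dest_dir: Path, max_passes: int = 3,
               copy_fn=copy_region) -> list[str]:
    """Copy every region; regions that fail are retried in another loop.

    Returns the regions that still failed after max_passes passes.
    """
    pending = list_regions(table_dir)
    # Step 2: write the region list out to a file (hypothetical name).
    dest_dir.mkdir(parents=True, exist_ok=True)
    (dest_dir / "regions.txt").write_text("\n".join(pending) + "\n")
    for _ in range(max_passes):
        if not pending:
            break
        # Keep only the regions whose copy reported failure.
        pending = [r for r in pending if not copy_fn(table_dir, dest_dir, r)]
    return pending
```

In the real tool each copy_region call would correspond to a mapper, and the file listing would come from the regionserver rather than from a directory scan; the retry loop is the part this sketch is meant to make concrete.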

