hbase-issues mailing list archives

From "Maddineni Sukumar (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables
Date Thu, 27 Apr 2017 06:42:04 GMT

    [ https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15986079#comment-15986079
] 

Maddineni Sukumar edited comment on HBASE-16466 at 4/27/17 6:41 AM:
--------------------------------------------------------------------

Ted, here are the perf numbers I have so far. I will follow up with numbers on load impact.

I ran the tests below on an 8-node cluster using a Phoenix table with SALT_BUCKETS = 16.

Rows           Normal approach    Snapshots approach
---------------------------------------------------------------------
1 million      1 min 16 sec       36 sec
10 million     6 min 15 sec       1 min 13 sec
500 million    5 hours 20 min     8 min 40 sec

With snapshots I am able to complete the VerifyReplication job on the 500 million row
table in under 9 minutes, instead of more than 5 hours with the normal table scan approach.
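
For reference, the speedup implied by each row of the table can be worked out with a small sketch like the one below. The class and method names are made up for illustration and are not part of the patch; the durations are copied from the table above. It gives roughly 2.1x, 5.1x and 36.9x for the 1 million, 10 million and 500 million row runs.

```java
import java.util.Locale;

// Illustrative only: this class is hypothetical and not part of HBASE-16466.
public class SnapshotSpeedup {

    // Convert an hours/minutes/seconds duration to seconds.
    static long toSeconds(long hours, long minutes, long seconds) {
        return hours * 3600 + minutes * 60 + seconds;
    }

    // Ratio of normal-scan duration to snapshot-based duration.
    static double speedup(long normalSeconds, long snapshotSeconds) {
        return (double) normalSeconds / snapshotSeconds;
    }

    public static void main(String[] args) {
        String[] rows = {"1 million", "10 million", "500 million"};
        // Durations taken from the table in this comment.
        long[] normal = {toSeconds(0, 1, 16), toSeconds(0, 6, 15), toSeconds(5, 20, 0)};
        long[] snapshot = {toSeconds(0, 0, 36), toSeconds(0, 1, 13), toSeconds(0, 8, 40)};

        for (int i = 0; i < rows.length; i++) {
            System.out.printf(Locale.ROOT, "%-12s %.1fx%n",
                    rows[i], speedup(normal[i], snapshot[i]));
        }
    }
}
```

The gap widens with table size, which fits the expectation that the fixed job overhead dominates the small runs.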


> HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-16466
>                 URL: https://issues.apache.org/jira/browse/HBASE-16466
>             Project: HBase
>          Issue Type: Improvement
>          Components: hbase
>    Affects Versions: 0.98.21
>            Reporter: Sukumar Maddineni
>            Assignee: Maddineni Sukumar
>             Fix For: 1.3.1
>
>         Attachments: HBASE-16466.branch-1.3.001.patch
>
>
> Currently the VerifyReplication tool runs using normal HBase scanners. If you want
> to run VerifyReplication multiple times on a live production cluster with large tables,
> it creates extra load on the HBase layer. If we implement snapshot-based support, we can
> read data from snapshots on both the source and the target, which reduces load on HBase.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
