hbase-issues mailing list archives

From "Jean-Marc Spaggiari (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HBASE-11715) HBase should provide a tool to compare 2 remote tables.
Date Sun, 10 Aug 2014 11:17:11 GMT
Jean-Marc Spaggiari created HBASE-11715:
-------------------------------------------

             Summary: HBase should provide a tool to compare 2 remote tables.
                 Key: HBASE-11715
                 URL: https://issues.apache.org/jira/browse/HBASE-11715
             Project: HBase
          Issue Type: Bug
            Reporter: Jean-Marc Spaggiari


As discussed on the mailing list, when a table is copied to another cluster and needs to be
validated against the original, only VerifyReplication can be used. However, this can take a
very long time since the data needs to be copied again.

We should provide an easier and faster way to compare the tables. 

One option is to calculate hashes per range. The user defines a number of buckets, then we split
the table into that many buckets and calculate a hash for each (as the partitioner is
already doing). We can also optionally calculate an overall CRC to further reduce the chance of an
undetected hash collision.
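
To make the idea concrete, here is a minimal sketch in Python. It is not HBase code and uses no HBase API: each table is modeled as an in-memory list of (row key, value) byte pairs in sorted key order. The names bucket_of, table_digest and differing_buckets are hypothetical, as are the split rule (an even split on the first byte of the row key) and the choice of MD5 for the per-bucket hash. Each cluster would compute its digest locally, so only the hashes need to cross the network.

```python
import hashlib
import zlib


def bucket_of(row_key: bytes, num_buckets: int) -> int:
    # Assumed split rule: divide the key space evenly on the first key byte.
    first = row_key[0] if row_key else 0
    return first * num_buckets // 256


def table_digest(rows, num_buckets):
    """Return (per-bucket hex hashes, overall CRC32) for sorted (key, value) rows."""
    hashers = [hashlib.md5() for _ in range(num_buckets)]
    crc = 0
    for key, value in rows:
        # Length-prefix both fields so (b"ab", b"c") and (b"a", b"bc") differ.
        record = (len(key).to_bytes(4, "big") + key
                  + len(value).to_bytes(4, "big") + value)
        hashers[bucket_of(key, num_buckets)].update(record)
        crc = zlib.crc32(record, crc)  # running CRC over the whole table
    return [h.hexdigest() for h in hashers], crc


def differing_buckets(digest_a, digest_b):
    """Indexes of buckets whose hashes differ between the two digests."""
    hashes_a, _ = digest_a
    hashes_b, _ = digest_b
    return [i for i, (a, b) in enumerate(zip(hashes_a, hashes_b)) if a != b]


source = [(b"\x10row", b"v1"), (b"\x90row", b"v2"), (b"\xf0row", b"v3")]
copy = [(b"\x10row", b"v1"), (b"\x90row", b"XX"), (b"\xf0row", b"v3")]
mismatched = differing_buckets(table_digest(source, 4), table_digest(copy, 4))
```

With 4 buckets, only the bucket holding the modified row is reported, so a follow-up row-by-row comparison could be limited to that key range. Both sides must use the same bucket boundaries and scan rows in the same order for the hashes to be comparable.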




--
This message was sent by Atlassian JIRA
(v6.2#6252)
