hbase-issues mailing list archives

From "zhangshibin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-17822) Set maxBadRows and outputDirectory option for VerifyReplication
Date Thu, 23 Mar 2017 08:45:42 GMT

    [ https://issues.apache.org/jira/browse/HBASE-17822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15937937#comment-15937937 ]

zhangshibin commented on HBASE-17822:
-------------------------------------

The maxbadrows and outputfile options are both optional and configurable. Use them like this:    --outputfile=/test --maxbadrows=10000
Yes, we need to find all inconsistent rows, so the default value of maxbadrows can be set to unlimited.
The option can keep the log from growing too long, or it can be used to repair inconsistent rows in batches.
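
To make the intended semantics concrete, here is a minimal sketch of how the two options could be parsed, with maxbadrows defaulting to unlimited. It is not the code from the attached patch: the class name BadRowOptions and its fields are hypothetical, and only the two flag names come from the usage line above.

```java
// Hypothetical sketch; not taken from HBASE-17822-master.patch.
public class BadRowOptions {
    // "Unlimited" is modelled here as Long.MAX_VALUE (an assumption).
    public static final long UNLIMITED = Long.MAX_VALUE;

    // null means the option was not given, so the original behaviour applies.
    public final String outputFile;
    public final long maxBadRows;

    private BadRowOptions(String outputFile, long maxBadRows) {
        this.outputFile = outputFile;
        this.maxBadRows = maxBadRows;
    }

    // Parses --outputfile=<path> and --maxbadrows=<n>; other arguments are ignored.
    public static BadRowOptions parse(String[] args) {
        final String outputKey = "--outputfile=";
        final String maxKey = "--maxbadrows=";
        String outputFile = null;
        long maxBadRows = UNLIMITED;
        for (String arg : args) {
            if (arg.startsWith(outputKey)) {
                outputFile = arg.substring(outputKey.length());
            } else if (arg.startsWith(maxKey)) {
                maxBadRows = Long.parseLong(arg.substring(maxKey.length()));
            }
        }
        return new BadRowOptions(outputFile, maxBadRows);
    }

    public static void main(String[] args) {
        BadRowOptions o = parse(new String[] {"--outputfile=/test", "--maxbadrows=10000"});
        System.out.println(o.outputFile + " " + o.maxBadRows);
    }
}
```

Leaving both flags out gives a null output file and no cap, which matches the "optional" behaviour described above.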

The intent is to find and repair inconsistent rows, but output written to the job log is mixed with other runtime logging and is too scattered.
Writing all inconsistent row keys to a single file may be a neater and easier approach.
So if the outputfile option is set, we get that file.
If not, we still use the original way.
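
The fallback behaviour can be sketched roughly as below. This is plain Java with no Hadoop or HBase dependencies, so it only illustrates the idea and is not how the MapReduce job in the patch does it. BadRowReporter, its methods, and the "BADROW" log prefix are hypothetical names, and System.err stands in for the job log.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical sketch; not taken from HBASE-17822-master.patch.
public class BadRowReporter {
    private final PrintWriter file;   // null: no output file, fall back to logging
    private final long maxBadRows;
    private long reported = 0;

    public BadRowReporter(PrintWriter file, long maxBadRows) {
        this.file = file;
        this.maxBadRows = maxBadRows;
    }

    // Records one inconsistent row key; returns false once the cap is reached.
    public boolean report(String rowKey) {
        if (reported >= maxBadRows) {
            return false;
        }
        reported++;
        if (file != null) {
            // One row key per line, kept apart from other runtime logging.
            file.print(rowKey);
            file.print('\n');
            file.flush();
        } else {
            // Original way: the row key goes to the log (System.err as a stand-in).
            System.err.println("BADROW " + rowKey);
        }
        return true;
    }

    public long reportedCount() {
        return reported;
    }

    public static void main(String[] args) {
        StringWriter buffer = new StringWriter();
        BadRowReporter reporter = new BadRowReporter(new PrintWriter(buffer), 2);
        for (String key : new String[] {"row1", "row2", "row3"}) {
            reporter.report(key);
        }
        System.out.print(buffer);
    }
}
```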

> Set maxBadRows and outputDirectory  option for VerifyReplication
> ----------------------------------------------------------------
>
>                 Key: HBASE-17822
>                 URL: https://issues.apache.org/jira/browse/HBASE-17822
>             Project: HBase
>          Issue Type: Improvement
>          Components: mapreduce
>            Reporter: zhangshibin
>             Fix For: 2.0.0
>
>         Attachments: HBASE-17822-master.patch
>
>
> Currently, this tool prints too many row keys as bad rows if the source and peer tables have many inconsistent rows, so it is necessary to set a maxBadRows limit on what is printed.
> Also, looking for bad row keys in the MR job log is inconvenient. It might be useful to add a reducer that aggregates the bad row keys, which would then be printed in the MR job output file.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
