hbase-issues mailing list archives

From "Nathan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-14442) MultiTableInputFormatBase.getSplits does not build split for a scan whose startRow=stopRow=(startRow of a region)
Date Mon, 21 Sep 2015 11:15:04 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14900513#comment-14900513 ]

Nathan commented on HBASE-14442:
--------------------------------

Can I just add a unit test for getSplits to TestMultiTableInputFormat.java, without starting
an MR job, like this:

    @Test
    public void testScanA5ToA5Splits() throws IOException, InterruptedException, ClassNotFoundException {
        String start = "aaaaa";
        String stop = "aaaaa";
        String jobName = "ScanA5ToA5Splits";
        Configuration c = new Configuration(TEST_UTIL.getConfiguration());

        List<Scan> scans = new ArrayList<Scan>();
        for (String tableName : TABLES) {
            Scan scan = new Scan();

            scan.addFamily(INPUT_FAMILY);
            scan.setAttribute(Scan.SCAN_ATTRIBUTES_TABLE_NAME, Bytes.toBytes(tableName));

            if (start != null) {
                scan.setStartRow(Bytes.toBytes(start));
            }
            if (stop != null) {
                scan.setStopRow(Bytes.toBytes(stop));
            }
            scans.add(scan);

            LOG.info("scan before: " + scan);
        }
        Job job = new Job(c, jobName);

        initJob(scans, job);
        job.setReducerClass(ScanReducer.class);
        job.setNumReduceTasks(1); // one to get final "first" and "last" key
        FileOutputFormat.setOutputPath(job, new Path(job.getJobName()));
        MultiTableInputFormat format = new MultiTableInputFormat();
      
        // As startRow equals stopRow (within the same region), the number of splits
        // should equal the number of scans.
        Assert.assertEquals(scans.size(), format.getSplits(job).size());
    }

> MultiTableInputFormatBase.getSplits does not build split for a scan whose startRow=stopRow=(startRow of a region)
> ----------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-14442
>                 URL: https://issues.apache.org/jira/browse/HBASE-14442
>             Project: HBase
>          Issue Type: Bug
>          Components: mapreduce
>    Affects Versions: 1.1.2
>            Reporter: Nathan
>            Assignee: Nathan
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> I created a Scan whose startRow and stopRow are both equal to a region's startRow, and
> found that no split (and therefore no map task) was built.
> The following is the source code of this condition:
> (startRow.length == 0 || keys.getSecond()[i].length == 0 ||
>                     Bytes.compareTo(startRow, keys.getSecond()[i]) < 0) &&
>                     (stopRow.length == 0 || Bytes.compareTo(stopRow,
>                             keys.getFirst()[i]) > 0)
> I think an "=" should be added.
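
To make the reported behaviour concrete, here is a minimal standalone sketch of the quoted condition. It is not the HBase source: the class SplitConditionSketch, the method regionMatches and the inclusiveStop flag are hypothetical names introduced for illustration, the local compare helper only mimics the unsigned lexicographic ordering of Bytes.compareTo, and the region end key "bbbbb" is made up. It assumes the proposed "=" is meant for the stopRow comparison (> 0 becoming >= 0), which the description does not state explicitly.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not part of HBase.
class SplitConditionSketch {

    // Unsigned lexicographic comparison, standing in for Bytes.compareTo.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    // Mirrors the quoted condition. regionStart plays the role of
    // keys.getFirst()[i], regionEnd that of keys.getSecond()[i].
    // inclusiveStop = false is the quoted code (> 0);
    // inclusiveStop = true is the assumed reading of the proposal (>= 0).
    static boolean regionMatches(byte[] startRow, byte[] stopRow,
                                 byte[] regionStart, byte[] regionEnd,
                                 boolean inclusiveStop) {
        boolean startsBeforeRegionEnd = startRow.length == 0
                || regionEnd.length == 0
                || compare(startRow, regionEnd) < 0;
        int cmp = compare(stopRow, regionStart);
        boolean stopsAfterRegionStart = stopRow.length == 0
                || (inclusiveStop ? cmp >= 0 : cmp > 0);
        return startsBeforeRegionEnd && stopsAfterRegionStart;
    }

    public static void main(String[] args) {
        byte[] row = "aaaaa".getBytes(StandardCharsets.UTF_8);       // startRow = stopRow = region start
        byte[] regionEnd = "bbbbb".getBytes(StandardCharsets.UTF_8); // made-up region end key

        // Quoted condition: compare(stopRow, regionStart) is 0, so the region is skipped.
        System.out.println(regionMatches(row, row, row, regionEnd, false)); // prints false
        // With ">=" the same region is selected.
        System.out.println(regionMatches(row, row, row, regionEnd, true));  // prints true
    }
}
```

One consequence of this reading: with ">=", a scan whose stopRow equals a region's start key but whose startRow lies in an earlier region would also select that later region, even though the stop row is exclusive. Whether that extra split is acceptable is a question for the actual patch.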



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)