hbase-issues mailing list archives

From "Hadoop QA (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-5140) TableInputFormat subclass to allow N number of splits per region during MR jobs
Date Tue, 10 Jan 2012 07:08:41 GMT

    [ https://issues.apache.org/jira/browse/HBASE-5140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183119#comment-13183119 ]

Hadoop QA commented on HBASE-5140:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12510012/Added_functionality_to_TableInputFormat_that_allows_splitting_of_regions.patch.1
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    -1 javadoc.  The javadoc tool appears to have generated -151 warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 80 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests:
                  org.apache.hadoop.hbase.mapreduce.TestImportTsv
                  org.apache.hadoop.hbase.mapred.TestTableMapReduce
                  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/713//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/713//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/713//console

This message is automatically generated.
                
> TableInputFormat subclass to allow N number of splits per region during MR jobs
> -------------------------------------------------------------------------------
>
>                 Key: HBASE-5140
>                 URL: https://issues.apache.org/jira/browse/HBASE-5140
>             Project: HBase
>          Issue Type: New Feature
>          Components: mapreduce
>    Affects Versions: 0.90.4
>            Reporter: Josh Wymer
>            Priority: Trivial
>              Labels: mapreduce, split
>             Fix For: 0.90.4
>
>         Attachments: Added_functionality_to_TableInputFormat_that_allows_splitting_of_regions.patch, Added_functionality_to_TableInputFormat_that_allows_splitting_of_regions.patch.1, Added_functionality_to_split_n_times_per_region_on_mapreduce_jobs.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> With regard to [HBASE-5138|https://issues.apache.org/jira/browse/HBASE-5138], I am working
> on a subclass of TableInputFormat that overrides getSplits to generate N splits per region
> and/or N splits per job. The idea is to convert each region's startKey and endKey from
> byte[] to BigDecimal, take the difference, divide it by N, convert the results back to
> byte[], and generate splits at the resulting values. Assuming your keys are uniformly
> distributed, this should produce splits with nearly the same number of rows each.
> Any suggestions on this issue are welcome.
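
The arithmetic described above can be sketched in plain Java. This is a minimal illustration, not the code in the attached patch: the class KeyRangeSplitter and its methods are hypothetical names, it uses BigInteger rather than the BigDecimal mentioned in the description (the boundaries are whole numbers either way), and it assumes keys can be compared as unsigned big-endian numbers after right-padding the shorter one with zero bytes. An empty end key, which HBase uses for the last region, is not handled here.

```java
import java.math.BigInteger;
import java.util.Arrays;

// Hypothetical helper, not part of HBase or of the attached patch.
final class KeyRangeSplitter {

    /** Right-pads a key with zero bytes so both keys have the same length. */
    static byte[] padRight(byte[] key, int length) {
        return Arrays.copyOf(key, length);
    }

    /** Encodes a non-negative value as exactly `length` big-endian bytes. */
    static byte[] toFixedLength(BigInteger value, int length) {
        byte[] raw = value.toByteArray(); // may have a leading sign byte or be shorter
        byte[] out = new byte[length];
        int copy = Math.min(raw.length, length);
        System.arraycopy(raw, raw.length - copy, out, length - copy, copy);
        return out;
    }

    /**
     * Returns n + 1 boundary keys from startKey to endKey inclusive.
     * Adjacent boundaries delimit n sub-ranges of nearly equal numeric width.
     */
    static byte[][] split(byte[] startKey, byte[] endKey, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        int len = Math.max(startKey.length, endKey.length);
        // Signum 1 makes BigInteger treat the bytes as an unsigned magnitude.
        BigInteger start = new BigInteger(1, padRight(startKey, len));
        BigInteger end = new BigInteger(1, padRight(endKey, len));
        if (start.compareTo(end) >= 0) {
            throw new IllegalArgumentException("startKey must sort before endKey");
        }
        BigInteger range = end.subtract(start);
        BigInteger parts = BigInteger.valueOf(n);
        byte[][] boundaries = new byte[n + 1][];
        for (int i = 0; i <= n; i++) {
            // Multiply before dividing so rounding error does not accumulate.
            BigInteger offset = range.multiply(BigInteger.valueOf(i)).divide(parts);
            boundaries[i] = toFixedLength(start.add(offset), len);
        }
        return boundaries;
    }
}
```

For a region spanning 0x00 to 0x10 with N = 4, the boundaries are 0x00, 0x04, 0x08, 0x0c and 0x10, and each adjacent pair would serve as the start and stop row of one split. If the numeric range is smaller than N, some boundaries repeat, so a caller would need to drop the empty sub-ranges.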

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
