hbase-dev mailing list archives

From "Lars George (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-1385) Revamp TableInputFormat, needs updating to match hadoop 0.20.x AND remove bit where we can make < maps than regions
Date Sat, 27 Jun 2009 18:06:47 GMT

    [ https://issues.apache.org/jira/browse/HBASE-1385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12724875#action_12724875 ]

Lars George commented on HBASE-1385:
------------------------------------

So it begins. I started porting the mapreduce package to the proper Hadoop 0.20 API. I will
upload a patch here soon, but I noticed an issue: Partitioner is not a JobConfigurable
anymore (as those were removed), so there is now no way of creating a table in the HRegionPartitioner.
I checked, and they had a TotalOrderPartitioner that also relied on the configure()
call, but that class is missing from the Hadoop mapreduce package. So I am not sure how this is
supposed to work.
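
One possible way around the missing configure() call, sketched here with stand-in types rather than the real Hadoop classes: let the partitioner implement a "configurable" style interface and rely on the framework to inject the job configuration right after it creates the instance reflectively. Everything named Sketch* below is hypothetical, including the configuration key, and whether Hadoop 0.20 actually does this injection for partitioners is an assumption that would need checking.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a job configuration (not Hadoop's Configuration).
class SketchConf {
    private final Map<String, String> values = new HashMap<>();
    void set(String key, String value) { values.put(key, value); }
    String get(String key) { return values.get(key); }
}

// Hypothetical hook: the framework calls setConf() right after construction.
interface SketchConfigurable {
    void setConf(SketchConf conf);
    SketchConf getConf();
}

// Stand-in for an abstract partitioner base class with no configure() method.
abstract class SketchPartitioner<KEY, VALUE> {
    abstract int getPartition(KEY key, VALUE value, int numPartitions);
}

// A partitioner that must learn its table name before it can do any work.
class SketchRegionPartitioner extends SketchPartitioner<String, String>
        implements SketchConfigurable {
    static final String OUTPUT_TABLE = "sketch.output.table"; // made-up key
    private SketchConf conf;
    private String tableName;

    @Override
    public void setConf(SketchConf conf) {
        this.conf = conf;
        // A real implementation would create the table handle here.
        this.tableName = conf.get(OUTPUT_TABLE);
    }

    @Override
    public SketchConf getConf() { return conf; }

    String getTableName() { return tableName; }

    @Override
    int getPartition(String key, String value, int numPartitions) {
        // Placeholder logic: hash the key instead of looking up its region.
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }
}

class PartitionerSketch {
    // Mimics a framework: build the object reflectively, then inject the configuration.
    static <T> T newInstance(Class<T> cls, SketchConf conf) {
        try {
            T instance = cls.getDeclaredConstructor().newInstance();
            if (instance instanceof SketchConfigurable) {
                ((SketchConfigurable) instance).setConf(conf);
            }
            return instance;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + cls.getName(), e);
        }
    }

    public static void main(String[] args) {
        SketchConf conf = new SketchConf();
        conf.set(SketchRegionPartitioner.OUTPUT_TABLE, "mytable");
        SketchRegionPartitioner p = newInstance(SketchRegionPartitioner.class, conf);
        System.out.println(p.getTableName() + " -> partition " + p.getPartition("row1", "v", 4));
    }
}
```

The point of the sketch is only that the table can be created in setConf() instead of configure(), provided something calls setConf() before the first getPartition().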

Further, there are no interfaces anymore and everything is simply "extended". For example:

{code}
public abstract class TableReduce<KEYIN extends WritableComparable<KEYIN>, VALUEIN extends Writable>
extends Reducer<KEYIN, VALUEIN, ImmutableBytesWritable, Put> {
{code}

I added "abstract" because Reducer is abstract too, but it does not have any abstract methods,
so I could leave it out. But I assume that we also want to force people to extend this class
(or use IdentityTableReduce). Right?
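
To make the effect of the "abstract" keyword concrete, here is a minimal sketch with stand-in classes (all *Sketch names are hypothetical, and the pass-through reduce() body is an assumption for illustration, not Hadoop's code). A class can be abstract without declaring a single abstract method; the only consequence is that callers must pick a concrete subclass, such as an identity one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-in base class: abstract, yet every method has a default body.
abstract class BaseReducerSketch<K, V> {
    // Default behaviour assumed here: pass each value through unchanged.
    protected void reduce(K key, Iterable<V> values, List<String> out) {
        for (V value : values) {
            out.add(key + "=" + value);
        }
    }
}

// Abstract with no abstract methods: "new TableReduceSketch<>()" does not
// compile, so users are forced to extend it or pick a concrete subclass.
abstract class TableReduceSketch<K, V> extends BaseReducerSketch<K, V> { }

// The ready-made concrete choice: inherits the pass-through behaviour as is.
class IdentityTableReduceSketch<K, V> extends TableReduceSketch<K, V> { }

class AbstractSketchDemo {
    public static void main(String[] args) {
        List<String> out = new ArrayList<>();
        new IdentityTableReduceSketch<String, Integer>()
                .reduce("row", Arrays.asList(1, 2), out);
        System.out.println(out);
    }
}
```

Dropping "abstract" from TableReduceSketch would change nothing at runtime; it would only allow the class to be instantiated directly.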

You can also see I am using the KEYIN, VALUEIN parameter names, as Hadoop has changed to the
same, away from K1, V1, K2, V2, which was confusing.

Lastly, why do we add the specific types into the TableReduce parameters, i.e. instead of

{code}
public abstract class TableReduce<KEYIN, VALUEIN> ...
{code}

which is what is in the base class

{code}
public abstract class Reducer<KEYIN,VALUEIN,KEYOUT,VALUEOUT> ...
{code}
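
As a rough illustration of what the two declarations trade off, the sketch below uses stand-in types (SketchWritable, SketchWritableComparable, SketchReducer and the example classes are all hypothetical, not the Hadoop or HBase classes). With bounds, shared code in the base class may call methods of the bound and unsuitable types are rejected at compile time; without bounds, any type is accepted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

interface SketchWritable { String write(); }
interface SketchWritableComparable<T> extends SketchWritable, Comparable<T> { }

// Stand-in for the unbounded base class.
abstract class SketchReducer<KEYIN, VALUEIN, KEYOUT, VALUEOUT> {
    protected void reduce(KEYIN key, Iterable<VALUEIN> values, List<VALUEOUT> out) { }
}

// Bounded variant: only writable keys and values are accepted.
abstract class BoundedTableReduce<KEYIN extends SketchWritableComparable<KEYIN>,
                                  VALUEIN extends SketchWritable>
        extends SketchReducer<KEYIN, VALUEIN, String, String> {
    // The bounds let shared helper code call write() on any key or value.
    protected String describe(KEYIN key, VALUEIN value) {
        return key.write() + "=" + value.write();
    }
}

// Unbounded variant: same shape as the base class, any types allowed.
abstract class UnboundedTableReduce<KEYIN, VALUEIN>
        extends SketchReducer<KEYIN, VALUEIN, String, String> { }

final class TextKey implements SketchWritableComparable<TextKey> {
    private final String text;
    TextKey(String text) { this.text = text; }
    @Override public String write() { return text; }
    @Override public int compareTo(TextKey other) { return text.compareTo(other.text); }
}

final class TextValue implements SketchWritable {
    private final String text;
    TextValue(String text) { this.text = text; }
    @Override public String write() { return text; }
}

class BoundedExample extends BoundedTableReduce<TextKey, TextValue> {
    @Override
    protected void reduce(TextKey key, Iterable<TextValue> values, List<String> out) {
        for (TextValue v : values) out.add(describe(key, v));
    }
}

// Accepted only by the unbounded variant: Integer is not a SketchWritable,
// so "extends BoundedTableReduce<Integer, Integer>" would not compile.
class UnboundedExample extends UnboundedTableReduce<Integer, Integer> {
    @Override
    protected void reduce(Integer key, Iterable<Integer> values, List<String> out) {
        int sum = 0;
        for (int v : values) sum += v;
        out.add(key + "=" + sum);
    }
}

class BoundsDemo {
    public static void main(String[] args) {
        List<String> out = new ArrayList<>();
        new BoundedExample().reduce(new TextKey("row1"),
                Arrays.asList(new TextValue("a"), new TextValue("b")), out);
        new UnboundedExample().reduce(7, Arrays.asList(1, 2, 3), out);
        System.out.println(out);
    }
}
```

If nothing in the base class ever needs to call a method of the bound, the unbounded form loses nothing and matches the parent declaration more closely.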

I also changed the class fields/members in TableSplit to have no "m_" prefix to be the same
as in all the other classes.

Finally, what do you think of renaming the classes to TableReducer and TableMapper? I do not
think it matters that much, but I am asking for opinions here.

Oh, and I am cleaning up the JavaDocs for the whole package as they are all over the place,
so the patch is rather a lot of lines in the end.

Comments?


> Revamp TableInputFormat, needs updating to match hadoop 0.20.x AND remove bit where we can make < maps than regions
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-1385
>                 URL: https://issues.apache.org/jira/browse/HBASE-1385
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>             Fix For: 0.21.0
>
>         Attachments: mr.patch
>
>
> Update TIF to match new MR.
> Remove the bit of logic where we will use number of configured maps as splits count rather than regions.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

