hbase-issues mailing list archives

From "Nick Dimiduk (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-8011) Refactor ImportTsv
Date Thu, 07 Mar 2013 00:38:12 GMT

    [ https://issues.apache.org/jira/browse/HBASE-8011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13595353#comment-13595353

Nick Dimiduk commented on HBASE-8011:

bq. typo?

I don't think so. The "splits file" is generated by HFileOutputFormat and contains a list of
split points for partitioning the generated data. I want to see this dependence on an online
table go away.
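
As a rough illustration of how a list of split points partitions data, here is a minimal sketch. It is not the HFileOutputFormat code; the class and method names are hypothetical, and it compares row keys as Strings for readability, whereas real row keys are byte arrays.

```java
import java.util.Arrays;

// Hypothetical sketch: map a row key to a partition using sorted split points.
public class SplitPointSketch {

    // Each split point is treated as the first key of the next partition,
    // so N split points yield N + 1 partitions.
    public static int partitionFor(String[] sortedSplitPoints, String rowKey) {
        int pos = Arrays.binarySearch(sortedSplitPoints, rowKey);
        if (pos >= 0) {
            // Exact match: the key starts the partition after this split point.
            return pos + 1;
        }
        // Not found: binarySearch returns -(insertionPoint) - 1.
        return -(pos + 1);
    }

    public static void main(String[] args) {
        String[] splits = {"g", "p"};
        for (String key : new String[] {"a", "g", "k", "z"}) {
            System.out.println(key + " -> partition " + partitionFor(splits, key));
        }
    }
}
```

In this sketch the split points come from a hard-coded array; in the current job they are derived from the regions of the online table, which is the dependence in question.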

bq. What if it's not null?

The only other place it's used is in {{createSubmittableJob}}. If it's null, the job is using
the provided TsvImporterMapper and the validation needs to be performed. When it's not null,
the user has provided their own implementation of the mapper. In that case, none of this
argument validation applies. Since ImportTsv is supposed to be user extensible, this validation
is really the responsibility of the mapper and should be moved there. I'll defer that work until
we know what a job runner should actually look like, around the same time as when Import and
ImportTsv are folded together.
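
The null check described above can be sketched roughly as follows. This is not the code from the patch: a plain {{Map}} stands in for the Hadoop Configuration, the class name is hypothetical, and the specific checks shown (a non-empty column list containing exactly one HBASE_ROW_KEY) are assumptions about what the argument validation covers.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: argument validation runs only when no custom mapper is set.
public class MapperValidationSketch {

    // Key names are illustrative assumptions, not taken from the patch.
    static final String MAPPER_KEY = "importtsv.mapper.class";
    static final String COLUMNS_KEY = "importtsv.columns";

    public static void validate(Map<String, String> conf) {
        if (conf.get(MAPPER_KEY) != null) {
            // A user-supplied mapper is in play; it owns its own validation.
            return;
        }
        // Default mapper: validate the arguments it depends on.
        String columns = conf.get(COLUMNS_KEY);
        if (columns == null || columns.isEmpty()) {
            throw new IllegalArgumentException("No columns specified");
        }
        int rowKeys = 0;
        for (String column : columns.split(",")) {
            if (column.equals("HBASE_ROW_KEY")) {
                rowKeys++;
            }
        }
        if (rowKeys != 1) {
            throw new IllegalArgumentException("Exactly one HBASE_ROW_KEY column is required");
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(COLUMNS_KEY, "HBASE_ROW_KEY,d:c1,d:c2");
        validate(conf); // passes: default mapper with a valid column spec
        System.out.println("validated");
    }
}
```

Moving the body of the default branch into the mapper itself would remove the need for the job setup code to know which mapper is configured.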
> Refactor ImportTsv
> ------------------
>                 Key: HBASE-8011
>                 URL: https://issues.apache.org/jira/browse/HBASE-8011
>             Project: HBase
>          Issue Type: Improvement
>          Components: mapreduce, Usability
>            Reporter: Nick Dimiduk
>            Assignee: Nick Dimiduk
>            Priority: Minor
>         Attachments: 0001-HBASE-8011-Refactor-ImportTsv.patch
> ImportTsv is a little goofy.
>  - It doesn't use the Tool, Configured interfaces like a mapreduce job should.
>  - It has a static HBaseAdmin field that must be initialized before the intended API of createSubmittableJob can be invoked.
>  - TsvParser is critical to the default mapper implementation but is unavailable to user custom mapper implementations without forcing them into the o.a.h.h.mapreduce namespace.
>  - The configuration key constants are not public.

