hbase-issues mailing list archives

From "Nick Dimiduk (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-8074) Consolidate map-side features across mapreduce tools into a single place
Date Tue, 12 Mar 2013 23:25:13 GMT

    [ https://issues.apache.org/jira/browse/HBASE-8074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600607#comment-13600607 ]

Nick Dimiduk commented on HBASE-8074:

TableInputFormat#setConf() already provides for most of these features. Why does TableMapReduceUtil#initTableMapperJob()
override these settings by forcing inclusion of a Scan instance?
> Consolidate map-side features across mapreduce tools into a single place
> ------------------------------------------------------------------------
>                 Key: HBASE-8074
>                 URL: https://issues.apache.org/jira/browse/HBASE-8074
>             Project: HBase
>          Issue Type: Sub-task
>          Components: mapreduce, Usability
>            Reporter: Nick Dimiduk
> The mapreduce tools support a similar but divergent set of features for mapping over
> KeyValue data:
>  * {{Export}} supports a version-range window, a rowkey regex or prefix filter, and a
>    "raw mode" that includes delete markers.
>  * {{Import}} can apply an arbitrary filter and can also apply a "transform", renaming
>    column families in the emitted KeyValues.
>  * {{CopyTable}} allows specifying a version-range window, limiting to a fixed number
>    of versions, a "raw mode", and column family transformation.
>  * {{WALPlayer}} supports reading a time-range.
>  * {{ImportTsv}} could incorporate a number of these features, especially the filter
>    and transform capabilities, so that a user need not implement a custom mapper when
>    the existing parser would suffice apart from a slight massage of the data.
> The proposal is to create a single implementation for these features with a single
> configuration interface. Ideally, such an implementation would be exposed via the common
> utility classes as well (i.e., IdentityTableMapper).
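
To make the proposal above concrete, here is a minimal sketch of what a single options holder read from one configuration interface might look like. Everything in it is an assumption made for illustration: the class name MapSideOptions, the "example.mapside.*" property names, and the "old:new" rename encoding are hypothetical and are not part of HBase. A plain Map stands in for the Hadoop Configuration object so the sketch has no dependencies.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these names or keys are part of HBase.
final class MapSideOptions {
    // Made-up property names, used only for this illustration.
    static final String TIME_START = "example.mapside.timerange.start";
    static final String TIME_END = "example.mapside.timerange.end";
    static final String ROW_PREFIX = "example.mapside.row.prefix";
    static final String RAW = "example.mapside.raw";
    static final String FAMILY_RENAMES = "example.mapside.family.renames";

    final long minTimestamp;  // inclusive lower bound
    final long maxTimestamp;  // exclusive upper bound
    final String rowPrefix;   // empty string matches every row
    final boolean raw;        // true = delete markers are passed through
    final Map<String, String> familyRenames = new HashMap<>();

    private MapSideOptions(Map<String, String> conf) {
        minTimestamp = Long.parseLong(conf.getOrDefault(TIME_START, "0"));
        maxTimestamp = Long.parseLong(
            conf.getOrDefault(TIME_END, String.valueOf(Long.MAX_VALUE)));
        rowPrefix = conf.getOrDefault(ROW_PREFIX, "");
        raw = Boolean.parseBoolean(conf.getOrDefault(RAW, "false"));
        // Renames are encoded as "old:new,old2:new2"; malformed pairs are skipped.
        for (String pair : conf.getOrDefault(FAMILY_RENAMES, "").split(",")) {
            String[] parts = pair.split(":");
            if (parts.length == 2) {
                familyRenames.put(parts[0].trim(), parts[1].trim());
            }
        }
    }

    // Every tool would build its options from the same configuration keys.
    static MapSideOptions fromConf(Map<String, String> conf) {
        return new MapSideOptions(conf);
    }

    // Filter step: row prefix, time range, and raw-mode handling in one place.
    boolean accepts(String rowKey, long timestamp, boolean isDeleteMarker) {
        if (isDeleteMarker && !raw) {
            return false;
        }
        return rowKey.startsWith(rowPrefix)
            && timestamp >= minTimestamp
            && timestamp < maxTimestamp;
    }

    // Transform step: rename a column family, or return it unchanged.
    String mapFamily(String family) {
        return familyRenames.getOrDefault(family, family);
    }
}
```

Under these assumptions, each tool's mapper would call MapSideOptions.fromConf(...) once during setup, then apply accepts(...) and mapFamily(...) to each cell, so the filter and transform behaviour would be defined in one place rather than per tool.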

