hbase-dev mailing list archives

From "Nick Dimiduk (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HBASE-8074) Consolidate map-side features across mapreduce tools into a single place
Date Tue, 12 Mar 2013 01:07:12 GMT
Nick Dimiduk created HBASE-8074:

             Summary: Consolidate map-side features across mapreduce tools into a single place
                 Key: HBASE-8074
                 URL: https://issues.apache.org/jira/browse/HBASE-8074
             Project: HBase
          Issue Type: Improvement
          Components: mapreduce, Usability
            Reporter: Nick Dimiduk

The mapreduce tools support similar but divergent sets of features for mapping over KeyValues:
 * {{Export}} supports specifying a version-range window, application of a rowkey regex or
prefix filter, and a "raw mode" that includes delete markers.
 * {{Import}} can apply an arbitrary filter and can also apply a "transform", renaming column
families in the emitted KeyValues.
 * {{CopyTable}} allows specifying a version-range window, limiting to a fixed number of versions,
a "raw mode", and column family transformation.
 * {{WALPlayer}} supports reading a time-range.
 * {{ImportTsv}} could incorporate a number of these features, especially the filter and transform
capabilities. That would let a user avoid implementing a custom mapper when the existing parser
is sufficient apart from a slight massage of the data.

The proposal is to create a single implementation for these features with a single configuration
interface. Ideally, such an implementation would be exposed via the common utility classes
as well (e.g., IdentityTableMapper).
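
As a rough illustration of what a "single configuration interface" might look like, here is a minimal
sketch in plain Java. Everything in it is hypothetical: the class name {{MapSideOptions}}, the
configuration keys, and the "old:new" rename syntax are invented for this example and are not existing
HBase APIs. A plain java.util.Map stands in for the job configuration, and the version-range window
is interpreted as a timestamp range (start inclusive, end exclusive), which is also an assumption.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical holder for the map-side options listed in the issue.
 * None of these names or keys come from HBase; they are illustrative only.
 */
final class MapSideOptions {
    // Hypothetical configuration keys, invented for this sketch.
    static final String START_TIME = "mapside.starttime";
    static final String END_TIME = "mapside.endtime";
    static final String ROW_PREFIX = "mapside.row.prefix";
    static final String RAW_MODE = "mapside.raw";
    static final String FAMILY_RENAME = "mapside.family.rename";

    final long startTime;   // inclusive lower bound on cell timestamp
    final long endTime;     // exclusive upper bound on cell timestamp
    final String rowPrefix; // empty string means "no prefix filter"
    final boolean rawMode;  // whether delete markers should be included
    final Map<String, String> familyRenames = new HashMap<>();

    MapSideOptions(Map<String, String> conf) {
        startTime = Long.parseLong(conf.getOrDefault(START_TIME, "0"));
        endTime = Long.parseLong(
            conf.getOrDefault(END_TIME, String.valueOf(Long.MAX_VALUE)));
        rowPrefix = conf.getOrDefault(ROW_PREFIX, "");
        rawMode = Boolean.parseBoolean(conf.getOrDefault(RAW_MODE, "false"));
        // Renames are given as "old1:new1,old2:new2"; malformed pairs are skipped.
        String renames = conf.getOrDefault(FAMILY_RENAME, "");
        for (String pair : renames.split(",")) {
            String[] parts = pair.split(":");
            if (parts.length == 2) {
                familyRenames.put(parts[0].trim(), parts[1].trim());
            }
        }
    }

    /** Returns true if a cell with this row key and timestamp passes the filters. */
    boolean accepts(String rowKey, long timestamp) {
        return rowKey.startsWith(rowPrefix)
            && timestamp >= startTime
            && timestamp < endTime;
    }

    /** Returns the renamed column family, or the original if no rename applies. */
    String mapFamily(String family) {
        return familyRenames.getOrDefault(family, family);
    }
}
```

Under this sketch, each tool's mapper would build one {{MapSideOptions}} from its job configuration and
call {{accepts}} and {{mapFamily}} per KeyValue, rather than each tool parsing its own flags. Regex row
filters, arbitrary filters, and version limits are left out to keep the example short.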

