hbase-issues mailing list archives

From "Jesse Yates (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-7702) Adding filtering to Import jobs
Date Tue, 29 Jan 2013 05:13:12 GMT

    [ https://issues.apache.org/jira/browse/HBASE-7702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13565084#comment-13565084 ]

Jesse Yates commented on HBASE-7702:

bq. In the first KeyValueImporter.map(...) you are converting the KVs twice now.

Yup, that was the point of keeping the kv reference, just missed doing that change - good catch.

bq. Why not just catch Exception at the Filter invocation rather than performing the same action
for 5 different Exceptions?

I try to avoid haphazard exception catching and just handle the specific cases, so you know
what's going on, but in this case it did seem a bit wasteful - happy to remove. It would be
great to just use Java 7's multi-catch exception handling.
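
For reference, a minimal sketch of what Java 7 multi-catch looks like when instantiating a class reflectively. This is not the code from the patch: the class name MultiCatchSketch, the instantiate helper, and the particular five exceptions are assumptions chosen for illustration (the patch's actual exception list is not shown in this thread). It only uses JDK reflection calls.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.InvocationTargetException;

// Hypothetical helper, for illustration only; not part of the HBASE-7702 patch.
final class MultiCatchSketch {

    /** Instantiates the named class through its no-arg constructor. */
    static Object instantiate(String className) {
        try {
            Class<?> clazz = Class.forName(className);
            Constructor<?> ctor = clazz.getDeclaredConstructor();
            return ctor.newInstance();
        } catch (ClassNotFoundException | NoSuchMethodException | InstantiationException
                | IllegalAccessException | InvocationTargetException e) {
            // One handler for all five cases, but the list stays explicit,
            // so unrelated exceptions (e.g. NullPointerException) still propagate.
            throw new IllegalArgumentException("Could not instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(instantiate("java.util.ArrayList").getClass().getName());
        try {
            instantiate("no.such.Filter");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Compared with catching plain Exception, this keeps the "handle the specific cases" intent while collapsing the duplicated handler bodies into one.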

bq. Should we mention in the help that only filterKeyValue(...) will be applied?


bq. As a general question: Should we filter before we convert the KVs or after?

Good point. I was thinking that it would make sense to filter _what's going into the table_,
but it could very easily be a filter of what's coming out of the export too (and is just as
valid). I like the latter case to help avoid doing the convert if we aren't going to accept
the key anyways. What's your feeling here?
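
To make the trade-off concrete, here is a rough sketch of the two orderings. It is deliberately not HBase code: plain strings stand in for KeyValues, and FilterOrderSketch, filterThenConvert, and convertThenFilter are hypothetical names. It uses Java 8 functional interfaces purely for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for the Import mapper; strings play the role of KeyValues.
final class FilterOrderSketch {

    /** Filters what comes out of the export; rejected entries are never converted. */
    static List<String> filterThenConvert(List<String> exported,
            Predicate<String> filter, UnaryOperator<String> convert) {
        List<String> out = new ArrayList<>();
        for (String kv : exported) {
            if (filter.test(kv)) {
                out.add(convert.apply(kv));
            }
        }
        return out;
    }

    /** Filters what goes into the table; every entry is converted first. */
    static List<String> convertThenFilter(List<String> exported,
            Predicate<String> filter, UnaryOperator<String> convert) {
        List<String> out = new ArrayList<>();
        for (String kv : exported) {
            String converted = convert.apply(kv);
            if (filter.test(converted)) {
                out.add(converted);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> exported = Arrays.asList("row1/cf:a", "row2/cf:a", "row3/cf:a");
        int[] conversions = {0};
        // Assumed conversion: a column-family rename, counted so the cost is visible.
        UnaryOperator<String> convert = kv -> {
            conversions[0]++;
            return kv.replace("cf:", "newcf:");
        };
        Predicate<String> filter = kv -> kv.startsWith("row2");

        System.out.println(filterThenConvert(exported, filter, convert)
                + " conversions=" + conversions[0]);
        conversions[0] = 0;
        System.out.println(convertThenFilter(exported, filter, convert)
                + " conversions=" + conversions[0]);
    }
}
```

With a row-based filter both orderings accept the same entries, and filtering first converts only the accepted ones. The results can differ when the filter inspects something the conversion changes (the family name in this sketch), which is the semantic half of the question.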
> Adding filtering to Import jobs
> -------------------------------
>                 Key: HBASE-7702
>                 URL: https://issues.apache.org/jira/browse/HBASE-7702
>             Project: HBase
>          Issue Type: Bug
>          Components: mapreduce
>            Reporter: Jesse Yates
>            Assignee: Jesse Yates
>             Fix For: 0.96.0, 0.94.5
>         Attachments: hbase-7702_0.94-v0.patch, hbase-7702_trunk-v0.patch
> Add the ability to filter to the Import MapReduce job.
> Oftentimes when restoring a table from an Export job, it's not desirable to import all
the rows, but rather just a subset. This adds the ability to import only the rows that
pass a given filter into the table.
> This is the complement to HBASE-2495
