hadoop-mapreduce-issues mailing list archives

From "Aaron Kimball (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-907) Sqoop should use more intelligent splits
Date Mon, 14 Sep 2009 20:42:57 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12755183#action_12755183 ]

Aaron Kimball commented on MAPREDUCE-907:
-----------------------------------------

1) --split-by specifies a column on which to split the current table. If it's importing
all tables, then any table-specific settings are meaningless. --all-tables does in fact
use JDBC metadata to infer the primary key. (As will --table if --split-by is not specified.)

2) Could do that.
3) The concepts seemed logically separate enough in my mind that they deserved separate variables.
4) Not necessarily. I can see this run() method being overridden and then called with super.run().
Wrapping with class-by-conf seems like a bad idea; this is exactly what inheritance is for.
5) +1 to the idea, though I feel like that's a separate API patch.
6) I wholeheartedly agree. There's definitely another JIRA for "More methods that mapreduce.Job
needs", but I think that's separate work.
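
To illustrate point 1, here is a minimal sketch of the column-selection rule described
there. It is not Sqoop's actual code: the class SplitColumnChooser, its choose() method,
and the parameter names are all hypothetical. It assumes the caller has already read the
table's primary key columns, for example via java.sql.DatabaseMetaData.getPrimaryKeys(),
and that for a composite key the first column is used, which the comment does not state.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical helper illustrating point 1; not part of Sqoop. */
public final class SplitColumnChooser {
    private SplitColumnChooser() {}

    /**
     * Picks the column to split on.
     *
     * @param splitBy           value of --split-by, or null if it was not given
     * @param allTables         true when importing every table (--all-tables)
     * @param primaryKeyColumns primary key columns in key order, as reported by JDBC metadata
     * @return the chosen column, or empty if none can be determined
     */
    public static Optional<String> choose(String splitBy, boolean allTables,
                                          List<String> primaryKeyColumns) {
        // A table-specific --split-by only makes sense for a single-table import.
        if (!allTables && splitBy != null && !splitBy.isEmpty()) {
            return Optional.of(splitBy);
        }
        // Otherwise fall back to the primary key inferred from metadata.
        if (primaryKeyColumns.isEmpty()) {
            return Optional.empty();
        }
        // Assumption: use the first column of a composite key.
        return Optional.of(primaryKeyColumns.get(0));
    }
}
```

With --all-tables the explicit column is ignored in this sketch, matching the remark that
table-specific settings are meaningless in that mode.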


> Sqoop should use more intelligent splits
> ----------------------------------------
>
>                 Key: MAPREDUCE-907
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-907
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: contrib/sqoop
>            Reporter: Aaron Kimball
>            Assignee: Aaron Kimball
>         Attachments: MAPREDUCE-907.patch
>
>
> Sqoop should use the new split generation / InputFormat in MAPREDUCE-885

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

