hadoop-mapreduce-dev mailing list archives

From "Aaron Kimball (JIRA)" <j...@apache.org>
Subject [jira] Resolved: (MAPREDUCE-1449) Sqoop Documentation about --split-by column has to be unique key seems to be wrong
Date Mon, 03 May 2010 18:43:56 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron Kimball resolved MAPREDUCE-1449.
--------------------------------------

    Resolution: Won't Fix

Sqoop has been removed from MapReduce; issue moved to http://github.com/cloudera/sqoop/issues#issue/2

> Sqoop Documentation about --split-by column has to be unique key seems to be wrong
> ----------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1449
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1449
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/sqoop
>    Affects Versions: 0.20.1
>            Reporter: mingran wang
>
> http://archive.cloudera.com/docs/sqoo... 
> The document above states: "To guarantee correctness of your input, you must select an ordering column for which each row has a unique value. If duplicate values appear in the ordering column, the results of the import are undefined, and Sqoop will not be able to detect the error."
> I read the Sqoop source code, and it seems that the column to split by doesn't have to be a unique key. Moreover, when the primary key is a composite key, the Sqoop code takes only the first column of the composite key, which in most cases is not a unique key anyway.
> I also checked the output when a non-unique key is used to split, and there is nothing wrong with the result.
> I am wondering whether the document is wrong, or whether there is some hidden trickiness that I am not aware of.
> I am using Sqoop 20.1.
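
The reporter's observation can be illustrated with a minimal sketch. This is not Sqoop's code: it assumes the import partitions rows by contiguous value ranges of the split column (half-open intervals between the column's minimum and maximum, with the last interval closed). The function names, column names, and sample rows are all hypothetical. Under that assumption, duplicate values in the split column only skew the split sizes; every row still lands in exactly one split.

```python
def split_ranges(lo, hi, num_splits):
    """Cut the integer interval [lo, hi] into num_splits contiguous ranges."""
    # Boundaries lo = b0 <= b1 <= ... <= bn = hi (hypothetical scheme, not Sqoop's).
    bounds = [lo + (hi - lo) * i // num_splits for i in range(num_splits + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def rows_for_split(rows, col, start, end, is_last):
    """Select rows with start <= row[col] < end; the last split also includes end."""
    return [
        r for r in rows
        if start <= r[col] < end or (is_last and r[col] == end)
    ]


def import_rows(rows, col, num_splits):
    """Partition rows into splits by value ranges of a possibly non-unique column."""
    values = [r[col] for r in rows]
    ranges = split_ranges(min(values), max(values), num_splits)
    last = len(ranges) - 1
    return [
        rows_for_split(rows, col, start, end, i == last)
        for i, (start, end) in enumerate(ranges)
    ]


# Hypothetical table: "dept" is deliberately non-unique.
rows = [
    {"id": 1, "dept": 10},
    {"id": 2, "dept": 10},
    {"id": 3, "dept": 20},
    {"id": 4, "dept": 20},
    {"id": 5, "dept": 20},
    {"id": 6, "dept": 30},
]

splits = import_rows(rows, "dept", 2)
imported_ids = sorted(r["id"] for split in splits for r in split)
```

Here `imported_ids` contains each `id` exactly once even though `dept` repeats. The sketch shows only that range predicates by themselves do not require uniqueness; it does not settle whether the documented warning was aimed at a different mechanism.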

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

