sqoop-dev mailing list archives

From Jarek Jarcec Cecho <jar...@apache.org>
Subject Re: Configurable NULL in IDF or Connector?
Date Mon, 01 Dec 2014 18:58:14 GMT
I share Gwen's point of view. The CSV format for the IDF is deliberately strict, so that we
have minimal surface area for inconsistencies between connectors. This is because
the IDF is the agreed-upon exchange format for transferring data from one connector to
another. That, however, shouldn't stop an individual connector (such as HDFS) from offering
ways to save the resulting CSV differently.
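
As an illustration of that split (a hypothetical sketch, not code from this thread or from the Sqoop code base): the exchange format keeps one fixed null token, and only the connector writing the text file substitutes a user-chosen marker. The class name, the method name, and the assumption that the fixed token is an unquoted `NULL` are all made up for the example.

```java
import java.util.StringJoiner;

// Hypothetical illustration; none of these names come from the Sqoop code base.
public class ConfigurableNullSketch {

    // Assumption: the strict exchange format marks a missing value with this
    // fixed, unquoted token. Quoted strings such as 'NULL' are ordinary data.
    public static final String EXCHANGE_NULL = "NULL";

    // Renders one record for a text file. Fields arrive already split and
    // encoded in the exchange format; only the null token is rewritten, and
    // the separator is chosen by the connector, not by the exchange format.
    public static String renderLine(String[] exchangeFields, String nullMarker, char separator) {
        StringJoiner line = new StringJoiner(String.valueOf(separator));
        for (String field : exchangeFields) {
            line.add(EXCHANGE_NULL.equals(field) ? nullMarker : field);
        }
        return line.toString();
    }

    public static void main(String[] args) {
        String[] record = {"1", "'alice'", "NULL"};
        // The connector-level settings (null marker, separator) are per job.
        System.out.println(renderLine(record, "\\N", '\t'));
    }
}
```

Under these assumptions the exchange format itself never changes; a different connector reading the same record would still see the fixed token.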

We had a similar discussion about separator and quote characters in SQOOP-1522, which seems
relevant to the NULL discussion here.

Jarcec

> On Dec 1, 2014, at 10:42 AM, Gwen Shapira <gshapira@cloudera.com> wrote:
> 
> I think it's two different things:
> 
> 1. HDFS connector should give more control over the formatting of the
> data in text files (nulls, escaping, etc.)
> 2. IDF should give NULLs in a format that is optimized for
> MySQL/Postgres direct connectors (since that's one of the IDF design
> goals).
> 
> Gwen
> 
> On Mon, Dec 1, 2014 at 9:52 AM, Abraham Elmahrek <abe@cloudera.com> wrote:
>> Hey guys,
>> 
>> Any thoughts on where configurable NULL values should be? Either the IDF or
>> HDFS connector?
>> 
>> cf: https://issues.apache.org/jira/browse/SQOOP-1678
>> 
>> -Abe

