sqoop-dev mailing list archives

From Abraham Elmahrek <...@cloudera.com>
Subject Re: Configurable NULL in IDF or Connector?
Date Mon, 01 Dec 2014 19:26:45 GMT
Indeed. I created SQOOP-1678 to address #1. Let me redefine
it...

Also, for #2... There are a few ways of generating output. It seems NULL
values range from "\N" to 0x0 to "NULL". I think keeping NULL makes sense.
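
To illustrate the split between #1 and #2, here is a minimal sketch, not
actual Sqoop code. The function render_row and its null_token and separator
parameters are hypothetical names. It assumes the exchange format always
writes NULL and that only a connector writing text files overrides the
token. Quoting and escaping are left out, so a string value equal to the
token would be ambiguous in this sketch.

```python
def render_row(values, null_token="NULL", separator=","):
    """Join one row into a text line, writing null_token for None values.

    Hypothetical helper: quoting and escaping are intentionally omitted.
    """
    return separator.join(
        null_token if value is None else str(value) for value in values
    )


row = [1, None, "abc"]

# Exchange-format style: keep the fixed NULL token.
idf_line = render_row(row)

# Connector-side override (assumed option): write \N instead.
text_file_line = render_row(row, null_token="\\N")

print(idf_line)
print(text_file_line)
```

The default keeps one agreed-upon token between connectors, while the
override only affects what a single connector writes out.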

On Mon, Dec 1, 2014 at 10:58 AM, Jarek Jarcec Cecho <jarcec@apache.org>
wrote:

> I do share the same point of view as Gwen. The CSV format for IDF is very
> strict so that we have minimal surface area for inconsistencies between
> multiple connectors. This is because the IDF is an agreed-upon exchange
> format when transferring data from one connector to the other. That,
> however, shouldn't stop one connector (such as HDFS) from offering ways to
> save the resulting CSV differently.
>
> We had a similar discussion about separator and quote characters in
> SQOOP-1522 that seems relevant to the NULL discussion here.
>
> Jarcec
>
> > On Dec 1, 2014, at 10:42 AM, Gwen Shapira <gshapira@cloudera.com> wrote:
> >
> > I think it's two different things:
> >
> > 1. HDFS connector should give more control over the formatting of the
> > data in text files (nulls, escaping, etc)
> > 2. IDF should give NULLs in a format that is optimized for
> > MySQL/Postgres direct connectors (since that's one of the IDF design
> > goals).
> >
> > Gwen
> >
> > On Mon, Dec 1, 2014 at 9:52 AM, Abraham Elmahrek <abe@cloudera.com> wrote:
> >> Hey guys,
> >>
> >> Any thoughts on where configurable NULL values should be? Either the
> >> IDF or HDFS connector?
> >>
> >> cf: https://issues.apache.org/jira/browse/SQOOP-1678
> >>
> >> -Abe
>
>
