From: Veena Basavaraj
Date: Mon, 1 Dec 2014 11:32:37 -0800
Subject: Re: Configurable NULL in IDF or Connector?
To: dev@sqoop.apache.org

+1 Gwen. Let's have a separate ticket for #2, since this should be part of
the Sqoop guidelines, esp. for the CSV string.

Best,
*./Vee*

On Mon, Dec 1, 2014 at 11:26 AM, Abraham Elmahrek wrote:

> Indeed. I created SQOOP-1678, which is intended to address #1. Let me
> re-define it...
>
> Also, for #2... There are a few ways of generating output. It seems NULL
> values range from "\N" to 0x0 to "NULL". I think keeping NULL makes sense.
>
> On Mon, Dec 1, 2014 at 10:58 AM, Jarek Jarcec Cecho wrote:
>
> > I do share the same point of view as Gwen. The CSV format for IDF is very
> > strict so that we have minimal surface area for inconsistencies between
> > multiple connectors. This is because the IDF is an agreed-upon exchange
> > format when transferring data from one connector to the other. That,
> > however, shouldn't stop one connector (such as HDFS) from offering ways
> > to save the resulting CSV differently.
> >
> > We had a similar discussion about separator and quote characters in
> > SQOOP-1522 that seems to be relevant to the NULL discussion here.
> >
> > Jarcec
> >
> > > On Dec 1, 2014, at 10:42 AM, Gwen Shapira wrote:
> > >
> > > I think it's two different things:
> > >
> > > 1. HDFS connector should give more control over the formatting of the
> > >    data in text files (nulls, escaping, etc.)
> > > 2. IDF should give NULLs in a format that is optimized for
> > >    MySQL/Postgres direct connectors (since that's one of the IDF design
> > >    goals).
> > >
> > > Gwen
> > >
> > > On Mon, Dec 1, 2014 at 9:52 AM, Abraham Elmahrek wrote:
> > >> Hey guys,
> > >>
> > >> Any thoughts on where configurable NULL values should be? Either the
> > >> IDF or HDFS connector?
> > >>
> > >> cf: https://issues.apache.org/jira/browse/SQOOP-1678
> > >>
> > >> -Abe