From: Jarek Jarcec Cecho
To: dev@sqoop.apache.org
Subject: Re: Configurable NULL in IDF or Connector?
Date: Mon, 1 Dec 2014 10:58:14 -0800
Message-Id: <06CC5BAA-608C-41D8-8BA2-3C5F7710B849@apache.org>

I share Gwen's point of view. The CSV format for the IDF is very strict so that we have minimal surface area for inconsistencies between connectors. This is because the IDF is the agreed-upon exchange format for transferring data from one connector to another. That, however, shouldn't stop a connector (such as HDFS) from offering ways to save the resulting CSV differently.

We had a similar discussion about separator and quote characters in SQOOP-1522, which seems relevant to the NULL discussion here.

Jarcec

> On Dec 1, 2014, at 10:42 AM, Gwen Shapira wrote:
>
> I think it's two different things:
>
> 1. HDFS connector should give more control over the formatting of the
> data in text files (nulls, escaping, etc.)
> 2. IDF should give NULLs in a format that is optimized for
> MySQL/Postgres direct connectors (since that's one of the IDF design
> goals).
>
> Gwen
>
> On Mon, Dec 1, 2014 at 9:52 AM, Abraham Elmahrek wrote:
>> Hey guys,
>>
>> Any thoughts on where configurable NULL values should be? Either the IDF or
>> HDFS connector?
>>
>> cf: https://issues.apache.org/jira/browse/SQOOP-1678
>>
>> -Abe