From dev-return-35602-archive-asf-public=cust-asf.ponee.io@sqoop.apache.org Mon Feb 12 15:31:49 2018 Return-Path: X-Original-To: archive-asf-public@eu.ponee.io Delivered-To: archive-asf-public@eu.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by mx-eu-01.ponee.io (Postfix) with ESMTP id 73EBD180652 for ; Mon, 12 Feb 2018 15:31:49 +0100 (CET) Received: by cust-asf.ponee.io (Postfix) id 62F8C160C3F; Mon, 12 Feb 2018 14:31:49 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id A95A9160C31 for ; Mon, 12 Feb 2018 15:31:48 +0100 (CET) Received: (qmail 58745 invoked by uid 500); 12 Feb 2018 14:31:47 -0000 Mailing-List: contact dev-help@sqoop.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@sqoop.apache.org Delivered-To: mailing list dev@sqoop.apache.org Received: (qmail 58730 invoked by uid 500); 12 Feb 2018 14:31:47 -0000 Delivered-To: apmail-incubator-sqoop-dev@incubator.apache.org Received: (qmail 58726 invoked by uid 99); 12 Feb 2018 14:31:47 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd4-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 12 Feb 2018 14:31:47 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd4-us-west.apache.org (ASF Mail Server at spamd4-us-west.apache.org) with ESMTP id 978A4C0096; Mon, 12 Feb 2018 14:31:46 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd4-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 0.69 X-Spam-Level: X-Spam-Status: No, score=0.69 tagged_above=-999 required=6.31 tests=[HTML_MESSAGE=2, KAM_LAZY_DOMAIN_SECURITY=1, RCVD_IN_DNSWL_MED=-2.3, T_RP_MATCHES_RCVD=-0.01] autolearn=disabled Received: from mx1-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd4-us-west.apache.org [10.40.0.11]) (amavisd-new, port 10024) with ESMTP id AsRBWBKTidnq; Mon, 12 Feb 2018 14:31:44 +0000 (UTC) Received: from mailrelay1-us-west.apache.org (mailrelay1-us-west.apache.org [209.188.14.139]) by mx1-lw-eu.apache.org (ASF Mail Server at mx1-lw-eu.apache.org) with ESMTP id D63715F3CD; Mon, 12 Feb 2018 14:31:43 +0000 (UTC) Received: from reviews.apache.org (unknown [10.41.0.12]) by mailrelay1-us-west.apache.org (ASF Mail Server at mailrelay1-us-west.apache.org) with ESMTP id 2D811E0047; Mon, 12 Feb 2018 14:31:43 +0000 (UTC) Received: from reviews-vm2.apache.org (localhost [IPv6:::1]) by reviews.apache.org (ASF Mail Server at reviews-vm2.apache.org) with ESMTP id 38B54C40017; Mon, 12 Feb 2018 14:31:41 +0000 (UTC) Content-Type: multipart/alternative; boundary="===============1651024655108277790==" MIME-Version: 1.0 Subject: Re: Review Request 65607: SQOOP-2976 Flag to expand decimal values to fit AVRO schema From: Fero Szabo via Review Board To: Szabolcs Vasas , Boglarka Egyed Cc: Fero Szabo , Sqoop Date: Mon, 12 Feb 2018 14:31:41 -0000 Message-ID: <20180212143141.24892.58407@reviews-vm2.apache.org> X-ReviewBoard-URL: https://reviews.apache.org/ Auto-Submitted: auto-generated Sender: Fero Szabo X-ReviewGroup: Sqoop X-Auto-Response-Suppress: DR, RN, OOF, AutoReply X-ReviewRequest-URL: https://reviews.apache.org/r/65607/ X-Sender: Fero Szabo References: <20180212130811.24860.57035@reviews-vm2.apache.org> In-Reply-To: <20180212130811.24860.57035@reviews-vm2.apache.org> X-ReviewBoard-Diff-For: src/test/org/apache/sqoop/testutil/AvroTestUtils.java X-ReviewBoard-Diff-For: src/test/org/apache/sqoop/testutil/ArgumentArrayBuilder.java X-ReviewBoard-Diff-For: src/test/org/apache/sqoop/manager/oracle/OracleAvroPaddingImportTest.java X-ReviewBoard-Diff-For: src/test/org/apache/sqoop/manager/hsqldb/TestHsqldbAvroPadding.java X-ReviewBoard-Diff-For: src/test/org/apache/sqoop/manager/sqlserver/SQLServerAvroPaddingImportTest.java Reply-To: Fero Szabo X-ReviewRequest-Repository: sqoop-trunk --===============1651024655108277790== MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/65607/ ----------------------------------------------------------- (Updated Feb. 12, 2018, 2:31 p.m.) Review request for Sqoop, Boglarka Egyed and Szabolcs Vasas. Bugs: SQOOP-2976 https://issues.apache.org/jira/browse/SQOOP-2976 Repository: sqoop-trunk Description ------- **Summary:** Certain databases, such as SQL Server and Postgres are storing decimal values padded with 0s, should the user insert them with less digits than the given scale. Other databases however, such as Oracle and HSQLDB store these numbers without trailing 0s. Then, when the JDBC driver returns these as BigDecimals, they won't match the scale in the avro schema. Take the following SQL commands for an example: ``` create table salary (id int, amount number (10,5)); insert into salary (id, amount) values (1, 10.5); insert into salary (id, amount) values (2, 10.50); select * from salary; ``` Records in an Oracle database: 1 10.5 2 10.5 Records in SQL Server (using decimal instead of number in the create statement): 1 10.50000 2 10.50000 **Solition:** The fix is simply checking the scale of the returned BigDecimals against what's in the avro schema and recreates the objects in case of a mismatch. I've introduced a new property to enable this new feature, so existing behavior is not affected. **Concerns: ** - trimmings can happen silently, should we rather raise an exception? Enabling trimming adds a new feature, but it also adds the possibility silently lose scale while import. The latter could be mitigated by a thorough documentation. - The flags current name () doesn't really match the behavior, should I change it to something else? (avro.decimal_scale_harmonization.enable) - How / where to document this new flag? **Other notable changes:** - Introduced ArgumentArrayBuilder that reuses the existing Argument class and introduces a useful builder pattern for creating commandline arguments for tests. - Slightly modified BaseSqoopTest to fit my needs. *(However, further refactoring would be required in this class to enable better reuse. For example: the current implementation can't be used with SQL Server, because one also needs to specify the schema besides the tablename in the create and insert statements. There are also code duplications.)* Diffs (updated) ----- src/java/org/apache/sqoop/avro/AvroUtil.java 1aae8df2 src/java/org/apache/sqoop/config/ConfigurationConstants.java 7a19a62c src/java/org/apache/sqoop/mapreduce/AvroImportMapper.java a5e5bf5a src/test/org/apache/sqoop/manager/hsqldb/TestHsqldbAvroPadding.java PRE-CREATION src/test/org/apache/sqoop/manager/oracle/OracleAvroPaddingImportTest.java PRE-CREATION src/test/org/apache/sqoop/manager/sqlserver/MSSQLTestUtils.java 2220b7d5 src/test/org/apache/sqoop/manager/sqlserver/SQLServerAvroPaddingImportTest.java PRE-CREATION src/test/org/apache/sqoop/testutil/ArgumentArrayBuilder.java PRE-CREATION src/test/org/apache/sqoop/testutil/AvroTestUtils.java PRE-CREATION src/test/org/apache/sqoop/testutil/BaseSqoopTestCase.java 588f439c Diff: https://reviews.apache.org/r/65607/diff/2/ Changes: https://reviews.apache.org/r/65607/diff/1-2/ Testing ------- See the 3 new test classes (HSQLDB, Oracle, SQL Server). Thanks, Fero Szabo --===============1651024655108277790==--