From dev-return-35607-archive-asf-public=cust-asf.ponee.io@sqoop.apache.org Tue Feb 13 12:39:42 2018 Return-Path: X-Original-To: archive-asf-public@eu.ponee.io Delivered-To: archive-asf-public@eu.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by mx-eu-01.ponee.io (Postfix) with ESMTP id B067E180656 for ; Tue, 13 Feb 2018 12:39:42 +0100 (CET) Received: by cust-asf.ponee.io (Postfix) id A01C5160C53; Tue, 13 Feb 2018 11:39:42 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id BBB9D160C50 for ; Tue, 13 Feb 2018 12:39:41 +0100 (CET) Received: (qmail 65397 invoked by uid 500); 13 Feb 2018 11:39:40 -0000 Mailing-List: contact dev-help@sqoop.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@sqoop.apache.org Delivered-To: mailing list dev@sqoop.apache.org Received: (qmail 65379 invoked by uid 500); 13 Feb 2018 11:39:40 -0000 Delivered-To: apmail-incubator-sqoop-dev@incubator.apache.org Received: (qmail 65375 invoked by uid 99); 13 Feb 2018 11:39:39 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd1-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 13 Feb 2018 11:39:39 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd1-us-west.apache.org (ASF Mail Server at spamd1-us-west.apache.org) with ESMTP id 71395C1A1B; Tue, 13 Feb 2018 11:39:39 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd1-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 2.141 X-Spam-Level: ** X-Spam-Status: No, score=2.141 tagged_above=-999 required=6.31 tests=[DKIM_ADSP_CUSTOM_MED=0.001, HEADER_FROM_DIFFERENT_DOMAINS=0.25, HTML_MESSAGE=2, KAM_LAZY_DOMAIN_SECURITY=1, NML_ADSP_CUSTOM_MED=1.2, RCVD_IN_DNSWL_MED=-2.3, T_RP_MATCHES_RCVD=-0.01] autolearn=disabled Received: from mx1-lw-us.apache.org ([10.40.0.8]) by localhost (spamd1-us-west.apache.org [10.40.0.7]) (amavisd-new, port 10024) with ESMTP id x6nXA7zNu4Qi; Tue, 13 Feb 2018 11:39:37 +0000 (UTC) Received: from mailrelay1-us-west.apache.org (mailrelay1-us-west.apache.org [209.188.14.139]) by mx1-lw-us.apache.org (ASF Mail Server at mx1-lw-us.apache.org) with ESMTP id 1300F5F17F; Tue, 13 Feb 2018 11:39:37 +0000 (UTC) Received: from reviews.apache.org (unknown [10.41.0.12]) by mailrelay1-us-west.apache.org (ASF Mail Server at mailrelay1-us-west.apache.org) with ESMTP id A790FE00E9; Tue, 13 Feb 2018 11:39:35 +0000 (UTC) Received: from reviews-vm2.apache.org (localhost [IPv6:::1]) by reviews.apache.org (ASF Mail Server at reviews-vm2.apache.org) with ESMTP id 7D21EC40191; Tue, 13 Feb 2018 11:39:32 +0000 (UTC) Content-Type: multipart/alternative; boundary="===============9135991397344135727==" MIME-Version: 1.0 Subject: Re: Review Request 65607: SQOOP-2976 Flag to expand decimal values to fit AVRO schema From: Szabolcs Vasas To: Szabolcs Vasas , Boglarka Egyed Cc: Fero Szabo , Sqoop Date: Tue, 13 Feb 2018 11:39:32 -0000 Message-ID: <20180213113932.2831.36687@reviews-vm2.apache.org> X-ReviewBoard-URL: https://reviews.apache.org/ Auto-Submitted: auto-generated Sender: Szabolcs Vasas X-ReviewGroup: Sqoop X-Auto-Response-Suppress: DR, RN, OOF, AutoReply X-ReviewRequest-URL: https://reviews.apache.org/r/65607/ X-Sender: Szabolcs Vasas References: <20180212143141.24892.58407@reviews-vm2.apache.org> In-Reply-To: <20180212143141.24892.58407@reviews-vm2.apache.org> X-ReviewBoard-Diff-For: src/test/org/apache/sqoop/testutil/AvroTestUtils.java X-ReviewBoard-Diff-For: src/test/org/apache/sqoop/testutil/ArgumentArrayBuilder.java X-ReviewBoard-Diff-For: src/test/org/apache/sqoop/manager/oracle/OracleAvroPaddingImportTest.java X-ReviewBoard-Diff-For: src/test/org/apache/sqoop/manager/hsqldb/TestHsqldbAvroPadding.java X-ReviewBoard-Diff-For: src/test/org/apache/sqoop/manager/sqlserver/SQLServerAvroPaddingImportTest.java Reply-To: Szabolcs Vasas X-ReviewRequest-Repository: sqoop-trunk --===============9135991397344135727== MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/65607/#review197269 ----------------------------------------------------------- src/java/org/apache/sqoop/avro/AvroUtil.java Lines 88 (patched) I think this if statement is redundant since if schemaContainingScale is null then the method will return null anyway. src/java/org/apache/sqoop/avro/AvroUtil.java Lines 93 (patched) Can we extract this string value to a constant? src/test/org/apache/sqoop/manager/oracle/OracleAvroPaddingImportTest.java Lines 101 (patched) Do we really need this file? It seems we do not use dates in this test case. src/test/org/apache/sqoop/manager/oracle/OracleAvroPaddingImportTest.java Lines 108 (patched) I think we can delete these lines. src/test/org/apache/sqoop/manager/sqlserver/SQLServerAvroPaddingImportTest.java Lines 93 (patched) A similar logic is already present in org.apache.sqoop.testutil.BaseSqoopTestCase can we somehow reuse that logic? For example it already has a ConnManager field which can be initialized in org.apache.sqoop.testutil.BaseSqoopTestCase#setUp src/test/org/apache/sqoop/manager/sqlserver/SQLServerAvroPaddingImportTest.java Lines 99 (patched) Do we really need this property to be set? It seems to be initialized to the default factor classes which are used by default anyway, aren't they? src/test/org/apache/sqoop/manager/sqlserver/SQLServerAvroPaddingImportTest.java Lines 145 (patched) Can we make these test methods names similar to the ones defined in TestHsqldbAvroPadding (testAvroImportWithoutPadding, testAvroImportWithPadding) src/test/org/apache/sqoop/testutil/ArgumentArrayBuilder.java Lines 81 (patched) Could we introduce a parameter-less version of this which would set the withHadoopCommons to true? I think most of the time this would be called to set the flag to true so could simplify the code on the client side. src/test/org/apache/sqoop/testutil/ArgumentArrayBuilder.java Lines 82 (patched) I think withHadoopFlags or withCommonHadoopFlags would be more descriptive name for this field. src/test/org/apache/sqoop/testutil/ArgumentArrayBuilder.java Lines 87 (patched) typo: Transforms src/test/org/apache/sqoop/testutil/ArgumentArrayBuilder.java Lines 102 (patched) I think this logic handling the tool options could be also moved to ArgumentUtils since it already contains similar stuff. After that this method could be simplified, the notEmpty checks could also be removed. We could also think about moving all the logic from ArgumentUtils to the builder since it would be a better practice to use the builder instead of the static methods. src/test/org/apache/sqoop/testutil/AvroTestUtils.java Lines 60 (patched) This variable does not seem to be used. src/test/org/apache/sqoop/testutil/AvroTestUtils.java Lines 64 (patched) I think instead of 'r' this reader should be closed finally. Can we use a try-with-resources statement here? src/test/org/apache/sqoop/testutil/AvroTestUtils.java Lines 67 (patched) We could convert the comment message to a parameter of fail() so it's more obvious for the test runner why the test has failed. src/test/org/apache/sqoop/testutil/AvroTestUtils.java Lines 73 (patched) Please remove System.out.println src/test/org/apache/sqoop/testutil/AvroTestUtils.java Lines 77 (patched) I think we could just let this method throw an IOException (or rethrow it in a RuntimeException) and JUnit will fail the test anyway and log the stack trace. src/test/org/apache/sqoop/testutil/AvroTestUtils.java Lines 94 (patched) This seems to be a dupicate of org.apache.sqoop.TestAvroImport#read can we resolve it somehow? src/test/org/apache/sqoop/testutil/BaseSqoopTestCase.java Lines 434 (patched) This documentation seems to be duplicated now, add the extra parameter or just delete it :) - Szabolcs Vasas On Feb. 12, 2018, 2:31 p.m., Fero Szabo wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/65607/ > ----------------------------------------------------------- > > (Updated Feb. 12, 2018, 2:31 p.m.) > > > Review request for Sqoop, Boglarka Egyed and Szabolcs Vasas. > > > Bugs: SQOOP-2976 > https://issues.apache.org/jira/browse/SQOOP-2976 > > > Repository: sqoop-trunk > > > Description > ------- > > **Summary:** > Certain databases, such as SQL Server and Postgres are storing decimal values padded with 0s, should the user insert them with less digits than the given scale. > > Other databases however, such as Oracle and HSQLDB store these numbers without trailing 0s. Then, when the JDBC driver returns these as BigDecimals, they won't match the scale in the avro schema. > > Take the following SQL commands for an example: > ``` > create table salary (id int, amount number (10,5)); > insert into salary (id, amount) values (1, 10.5); > insert into salary (id, amount) values (2, 10.50); > select * from salary; > ``` > Records in an Oracle database: > 1 10.5 > 2 10.5 > > Records in SQL Server (using decimal instead of number in the create statement): > 1 10.50000 > 2 10.50000 > > **Solition:** > The fix is simply checking the scale of the returned BigDecimals against what's in the avro schema and recreates the objects in case of a mismatch. I've introduced a new property to enable this new feature, so existing behavior is not affected. > > **Concerns: ** > - trimmings can happen silently, should we rather raise an exception? Enabling trimming adds a new feature, but it also adds the possibility silently lose scale while import. The latter could be mitigated by a thorough documentation. > - The flags current name () doesn't really match the behavior, should I change it to something else? (avro.decimal_scale_harmonization.enable) > - How / where to document this new flag? > > > **Other notable changes:** > - Introduced ArgumentArrayBuilder that reuses the existing Argument class and introduces a useful builder pattern for creating commandline arguments for tests. > - Slightly modified BaseSqoopTest to fit my needs. *(However, further refactoring would be required in this class to enable better reuse. For example: the current implementation can't be used with SQL Server, because one also needs to specify the schema besides the tablename in the create and insert statements. There are also code duplications.)* > > > Diffs > ----- > > src/java/org/apache/sqoop/avro/AvroUtil.java 1aae8df2 > src/java/org/apache/sqoop/config/ConfigurationConstants.java 7a19a62c > src/java/org/apache/sqoop/mapreduce/AvroImportMapper.java a5e5bf5a > src/test/org/apache/sqoop/manager/hsqldb/TestHsqldbAvroPadding.java PRE-CREATION > src/test/org/apache/sqoop/manager/oracle/OracleAvroPaddingImportTest.java PRE-CREATION > src/test/org/apache/sqoop/manager/sqlserver/MSSQLTestUtils.java 2220b7d5 > src/test/org/apache/sqoop/manager/sqlserver/SQLServerAvroPaddingImportTest.java PRE-CREATION > src/test/org/apache/sqoop/testutil/ArgumentArrayBuilder.java PRE-CREATION > src/test/org/apache/sqoop/testutil/AvroTestUtils.java PRE-CREATION > src/test/org/apache/sqoop/testutil/BaseSqoopTestCase.java 588f439c > > > Diff: https://reviews.apache.org/r/65607/diff/2/ > > > Testing > ------- > > See the 3 new test classes (HSQLDB, Oracle, SQL Server). > > > Thanks, > > Fero Szabo > > --===============9135991397344135727==--