Date: Tue, 9 Aug 2016 00:38:20 +0000 (UTC)
From: "ASF GitHub Bot (JIRA)"
To: issues@drill.apache.org
Reply-To: dev@drill.apache.org
Subject: [jira] [Commented] (DRILL-4704) select statement behavior is inconsistent for decimal values in parquet

    [ https://issues.apache.org/jira/browse/DRILL-4704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15412714#comment-15412714 ]

ASF GitHub Bot commented on DRILL-4704:
---------------------------------------

Github user daveoshinsky commented on a diff in the pull request:

    https://github.com/apache/drill/pull/517#discussion_r73981580

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/ExpressionTreeMaterializer.java ---
    @@ -267,6 +267,17 @@ public LogicalExpression visitBooleanOperator(BooleanOperator op, FunctionLookup
           return new BooleanOperator(op.getName(), args, op.getPosition());
         }

    +  static public int computePrecision(LogicalExpression currentArg) {
    +    int precision = currentArg.getMajorType().getPrecision();
    --- End diff --

    Well, if the type has no precision, what do you suggest? Move the fix back to CastIntDecimal.java (calculate the precision based on the value), as I originally had it, but Jinfeng insisted was incorrect?
    You two fellows, Jinfeng and Aman, can decide among yourselves. Let me know when you're finished duking it out. If there are any survivors, we can discuss on the hangout tomorrow.

        On Monday, August 8, 2016 8:04 PM, Aman Sinha wrote:

    In exec/java-exec/src/main/java/org/apache/drill/exec/expr/ExpressionTreeMaterializer.java:
    > @@ -267,6 +267,17 @@ public LogicalExpression visitBooleanOperator(BooleanOperator op, FunctionLookup
    >        return new BooleanOperator(op.getName(), args, op.getPosition());
    >      }
    >
    > +  static public int computePrecision(LogicalExpression currentArg) {
    > +    int precision = currentArg.getMajorType().getPrecision();
    Did you consider checking whether the type has a precision or not using getMajorType().hasPrecision()? That way, you would only call getPrecision() if it returned True and otherwise set the precision for INT, BIGINT.
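
For reference, the hasPrecision() suggestion quoted above could look roughly like the sketch below. It is a self-contained illustration, not the code from the pull request: MajorType and MinorType here are hypothetical stand-ins for Drill's types, and the fallback values (10 digits for INT, 19 for BIGINT, the digit counts of Integer.MAX_VALUE and Long.MAX_VALUE) are an assumption about what "set the precision for INT, BIGINT" might mean.

```java
import java.util.OptionalInt;

public class PrecisionSketch {

    // Hypothetical stand-in for the minor types mentioned in the thread.
    enum MinorType { INT, BIGINT, DECIMAL18 }

    // Hypothetical stand-in for a major type whose precision may be unset.
    static final class MajorType {
        private final MinorType minorType;
        private final OptionalInt precision;

        MajorType(MinorType minorType, OptionalInt precision) {
            this.minorType = minorType;
            this.precision = precision;
        }

        MinorType getMinorType() { return minorType; }

        boolean hasPrecision() { return precision.isPresent(); }

        // Returns 0 when unset, which is why the caller should check hasPrecision() first.
        int getPrecision() { return precision.orElse(0); }
    }

    // Use the declared precision when there is one; otherwise fall back to a
    // per-type default (assumed values, see the note above the code).
    static int computePrecision(MajorType type) {
        if (type.hasPrecision()) {
            return type.getPrecision();
        }
        switch (type.getMinorType()) {
            case INT:
                return 10; // Integer.MAX_VALUE = 2147483647 has 10 digits
            case BIGINT:
                return 19; // Long.MAX_VALUE = 9223372036854775807 has 19 digits
            default:
                throw new IllegalArgumentException(
                    "No precision available for " + type.getMinorType());
        }
    }

    public static void main(String[] args) {
        System.out.println(computePrecision(new MajorType(MinorType.INT, OptionalInt.empty())));
        System.out.println(computePrecision(new MajorType(MinorType.BIGINT, OptionalInt.empty())));
        System.out.println(computePrecision(new MajorType(MinorType.DECIMAL18, OptionalInt.of(18))));
    }
}
```

Whether a type-based default like this is preferable to deriving the precision from the value in CastIntDecimal.java is exactly the open question in the thread; the sketch only shows the shape of the guarded call.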


> select statement behavior is inconsistent for decimal values in parquet
> -----------------------------------------------------------------------
>
>                 Key: DRILL-4704
>                 URL: https://issues.apache.org/jira/browse/DRILL-4704
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Functions - Drill
>    Affects Versions: 1.6.0
>         Environment: Windows 7 Pro, Java 1.8.0_91
>            Reporter: Dave Oshinsky
>             Fix For: Future
>
>
> A select statement that searches a parquet file for a decimal value matching a specific value behaves inconsistently.  The query expressed most simply finds nothing:
> 0: jdbc:drill:zk=local> select * from dfs.`c:/archiveHR/HR.EMPLOYEES` where employee_id = 100;
> +--------------+-------------+------------+--------+---------------+-----------+
> | EMPLOYEE_ID  | FIRST_NAME  | LAST_NAME  | EMAIL  | PHONE_NUMBER  | HIRE_DATE |
> +--------------+-------------+------------+--------+---------------+-----------+
> +--------------+-------------+------------+--------+---------------+-----------+
> No rows selected (0.348 seconds)
> The query can be modified to find the matching row in a few ways, such as the following (using between instead of '=', changing 100 to 100.0, or casting as decimal):
> 0: jdbc:drill:zk=local> select * from dfs.`c:/archiveHR/HR.EMPLOYEES` where employee_id between 100 and 100;
> +--------------+-------------+------------+--------+---------------+-----------+
> | EMPLOYEE_ID  | FIRST_NAME  | LAST_NAME  | EMAIL  | PHONE_NUMBER  | HIR       |
> +--------------+-------------+------------+--------+---------------+-----------+
> | 100          | Steven      | King       | SKING  | 515.123.4567  | 2003-06-1 |
> +--------------+-------------+------------+--------+---------------+-----------+
> 1 row selected (0.226 seconds)
> 0: jdbc:drill:zk=local> select * from dfs.`c:/archiveHR/HR.EMPLOYEES` where employee_id = 100.0;
> +--------------+-------------+------------+--------+---------------+-----------+
> | EMPLOYEE_ID  | FIRST_NAME  | LAST_NAME  | EMAIL  | PHONE_NUMBER  | HIR       |
> +--------------+-------------+------------+--------+---------------+-----------+
> | 100          | Steven      | King       | SKING  | 515.123.4567  | 2003-06-1 |
> +--------------+-------------+------------+--------+---------------+-----------+
> 1 row selected (0.259 seconds)
> 0: jdbc:drill:zk=local> select * from dfs.`c:/archiveHR/HR.EMPLOYEES` where cast(employee_id AS DECIMAL) = 100;
> +--------------+-------------+------------+--------+---------------+-----------+
> | EMPLOYEE_ID  | FIRST_NAME  | LAST_NAME  | EMAIL  | PHONE_NUMBER  | HIR       |
> +--------------+-------------+------------+--------+---------------+-----------+
> | 100          | Steven      | King       | SKING  | 515.123.4567  | 2003-06-1 |
> +--------------+-------------+------------+--------+---------------+-----------+
> 1 row selected (0.232 seconds)
> 0: jdbc:drill:zk=local>
> The schema of the parquet data that is being searched is as follows:
> $ java -jar parquet-tools*1.jar meta c:/archiveHR/HR.EMPLOYEES/1.parquet
> file:            file:/c:/archiveHR/HR.EMPLOYEES/1.parquet
> creator:         parquet-mr version 1.8.1 (build 4aba4dae7bb0d4edbcf7923ae1339f28fd3f7fcf)
> .....
> file schema:     HR.EMPLOYEES
> --------------------------------------------------------------------------------
> EMPLOYEE_ID:     REQUIRED FIXED_LEN_BYTE_ARRAY O:DECIMAL R:0 D:0
> FIRST_NAME:      OPTIONAL BINARY O:UTF8 R:0 D:1
> LAST_NAME:       REQUIRED BINARY O:UTF8 R:0 D:0
> EMAIL:           REQUIRED BINARY O:UTF8 R:0 D:0
> PHONE_NUMBER:    OPTIONAL BINARY O:UTF8 R:0 D:1
> HIRE_DATE:       REQUIRED BINARY O:UTF8 R:0 D:0
> JOB_ID:          REQUIRED BINARY O:UTF8 R:0 D:0
> SALARY:          OPTIONAL FIXED_LEN_BYTE_ARRAY O:DECIMAL R:0 D:1
> COMMISSION_PCT:  OPTIONAL FIXED_LEN_BYTE_ARRAY O:DECIMAL R:0 D:1
> MANAGER_ID:      OPTIONAL FIXED_LEN_BYTE_ARRAY O:DECIMAL R:0 D:1
> DEPARTMENT_ID:   OPTIONAL FIXED_LEN_BYTE_ARRAY O:DECIMAL R:0 D:1
> row group 1:     RC:107 TS:9943 OFFSET:4
> --------------------------------------------------------------------------------
> EMPLOYEE_ID:     FIXED_LEN_BYTE_ARRAY SNAPPY DO:0 FPO:4 SZ:360/355/0.99 VC:107 ENC:PLAIN,BIT_PACKED
> FIRST_NAME:      BINARY SNAPPY DO:0 FPO:364 SZ:902/1058/1.17 VC:107 ENC:PLAIN_DICTIONARY,RLE,BIT_PACKED
> LAST_NAME:       BINARY SNAPPY DO:0 FPO:1266 SZ:913/1111/1.22 VC:107 ENC:PLAIN,BIT_PACKED
> EMAIL:           BINARY SNAPPY DO:0 FPO:2179 SZ:977/1184/1.21 VC:107 ENC:PLAIN,BIT_PACKED
> PHONE_NUMBER:    BINARY SNAPPY DO:0 FPO:3156 SZ:750/1987/2.65 VC:107 ENC:PLAIN,RLE,BIT_PACKED
> HIRE_DATE:       BINARY SNAPPY DO:0 FPO:3906 SZ:874/2636/3.02 VC:107 ENC:PLAIN_DICTIONARY,BIT_PACKED
> JOB_ID:          BINARY SNAPPY DO:0 FPO:4780 SZ:254/302/1.19 VC:107 ENC:PLAIN_DICTIONARY,BIT_PACKED
> SALARY:          FIXED_LEN_BYTE_ARRAY SNAPPY DO:0 FPO:5034 SZ:419/580/1.38 VC:107 ENC:PLAIN,RLE,BIT_PACKED
> COMMISSION_PCT:  FIXED_LEN_BYTE_ARRAY SNAPPY DO:0 FPO:5453 SZ:97/113/1.16 VC:107 ENC:PLAIN,RLE,BIT_PACKED
> MANAGER_ID:      FIXED_LEN_BYTE_ARRAY SNAPPY DO:0 FPO:5550 SZ:168/363/2.16 VC:107 ENC:PLAIN,RLE,BIT_PACKED
> DEPARTMENT_ID:   FIXED_LEN_BYTE_ARRAY SNAPPY DO:0 FPO:5718 SZ:94/254/2.70 VC:107 ENC:PLAIN,RLE,BIT_PACKED



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)