Date: Fri, 20 May 2016 15:47:13 +0000 (UTC)
From: "ASF GitHub Bot (JIRA)"
To: issues@drill.apache.org
Reply-To: dev@drill.apache.org
Subject: [jira] [Commented] (DRILL-4679) CONVERT_FROM() json format fails if 0 rows are received from upstream operator


    [ https://issues.apache.org/jira/browse/DRILL-4679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15293579#comment-15293579 ]

ASF GitHub Bot commented on DRILL-4679:
---------------------------------------

Github user jinfengni commented on a diff in the
pull request:

    https://github.com/apache/drill/pull/504#discussion_r64063077

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/project/ProjectRecordBatch.java ---
    @@ -146,6 +159,27 @@ protected IterOutcome doWork() {
               if (next == IterOutcome.OUT_OF_MEMORY) {
                 outOfMemory = true;
                 return next;
    +          } else if (next == IterOutcome.NONE) {
    +            // since this is first batch and we already got a NONE, need to set up the schema
    +
    +            //allocate vv in the allocationVectors.
    +            for (final ValueVector v : this.allocationVectors) {
    --- End diff --

    The allocation logic could reuse the existing doAlloc() method.


> CONVERT_FROM() json format fails if 0 rows are received from upstream operator
> -------------------------------------------------------------------------------
>
>                 Key: DRILL-4679
>                 URL: https://issues.apache.org/jira/browse/DRILL-4679
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Relational Operators
>   Affects Versions: 1.6.0
>            Reporter: Aman Sinha
>            Assignee: Jinfeng Ni
>
> CONVERT_FROM() json format fails as below if the underlying Filter produces 0 rows:
> {noformat}
> 0: jdbc:drill:zk=local> select convert_from('{"abc":"xyz"}', 'json') as x from cp.`tpch/region.parquet` where r_regionkey = 9999;
> Error: SYSTEM ERROR: IllegalStateException: next() returned NONE without first returning OK_NEW_SCHEMA [#16, ProjectRecordBatch]
> Fragment 0:0
> {noformat}
> If the conversion is applied as UTF8 format, the same query succeeds:
> {noformat}
> 0: jdbc:drill:zk=local> select convert_from('{"abc":"xyz"}', 'utf8') as x from cp.`tpch/region.parquet` where r_regionkey = 9999;
> +----+
> | x |
> +----+
> +----+
> No rows selected (0.241 seconds)
> {noformat}
> The reason for this is the special handling of JSON in the ProjectRecordBatch. The output schema is not known until run time, and the ComplexWriter in the Project relies on seeing the input data to determine the output schema, which could be a MapVector, a ListVector, etc.
> If the input data has 0 rows due to a filter condition, we should at least produce a default output schema, e.g. an empty MapVector? We need to decide on a good default.
> Note that CONVERT_FROM(x, 'json') could occur on 2 branches of a UNION-ALL, and if one input is empty while the other is not, the default schema may still cause an incompatibility.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
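
The default-schema idea from the issue description can be sketched with a toy model. This is only an illustration under stated assumptions: the classes `Upstream` and `Project`, the `Outcome` enum, and the schema strings are hypothetical stand-ins and are not Drill's ProjectRecordBatch, IterOutcome, or vector classes. The sketch shows one way an operator might publish a default schema when its input ends before any rows arrive, so that downstream never sees NONE without a preceding OK_NEW_SCHEMA.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Toy model of the outcome contract; hypothetical, not Drill's actual classes. */
public class EmptyInputSchemaSketch {

  enum Outcome { OK_NEW_SCHEMA, OK, NONE }

  /** Hypothetical upstream operator that replays a fixed list of outcomes. */
  static final class Upstream {
    private final List<Outcome> outcomes;
    private int pos = 0;

    Upstream(List<Outcome> outcomes) {
      this.outcomes = outcomes;
    }

    Outcome next() {
      // Once the scripted outcomes are exhausted, keep reporting NONE.
      return pos < outcomes.size() ? outcomes.get(pos++) : Outcome.NONE;
    }
  }

  /** Hypothetical project operator whose output schema depends on input data. */
  static final class Project {
    private final Upstream upstream;
    private boolean schemaSent = false;
    private boolean done = false;
    String schema = null;

    Project(Upstream upstream) {
      this.upstream = upstream;
    }

    Outcome next() {
      if (done) {
        return Outcome.NONE;
      }
      Outcome in = upstream.next();
      if (in == Outcome.NONE) {
        done = true;
        if (!schemaSent) {
          // Input ended before any data was seen: publish an assumed default
          // schema (an empty map column) instead of returning NONE right away.
          schema = "x: MAP{}";
          schemaSent = true;
          return Outcome.OK_NEW_SCHEMA;
        }
        return Outcome.NONE;
      }
      if (!schemaSent) {
        // Data is available, so the schema can be derived from it (stubbed here).
        schema = "x: MAP{abc: VARCHAR}";
        schemaSent = true;
        return Outcome.OK_NEW_SCHEMA;
      }
      return Outcome.OK;
    }
  }

  /** Calls next() until NONE and returns every outcome seen, including NONE. */
  static List<Outcome> drain(Project p) {
    List<Outcome> seen = new ArrayList<>();
    Outcome o;
    do {
      o = p.next();
      seen.add(o);
    } while (o != Outcome.NONE);
    return seen;
  }

  public static void main(String[] args) {
    Project p = new Project(new Upstream(Collections.<Outcome>emptyList()));
    System.out.println(drain(p)); // prints [OK_NEW_SCHEMA, NONE]
  }
}
```

The sketch does not address the UNION-ALL concern raised in the description: a default such as an empty map may still differ from the schema the non-empty branch derives from real data.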