Mailing-List: contact dev-help@drill.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@drill.apache.org
Date: Mon, 26 Jan 2015 17:54:35 +0000 (UTC)
From: "Victoria Markman (JIRA)" <jira@apache.org>
To: dev@drill.apache.org
Message-ID: <JIRA.12770134.1422294855000.174215.1422294875203@Atlassian.JIRA>
In-Reply-To: <JIRA.12770134.1422294855000@Atlassian.JIRA>
References: <JIRA.12770134.1422294855000@Atlassian.JIRA>
 <JIRA.12770134.1422294855836@arcas>
Subject: [jira] [Created] (DRILL-2069) Star is not expanded correctly in the
 query with IN clause containing subquery
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

Victoria Markman created DRILL-2069:
---------------------------------------

             Summary: Star is not expanded correctly in the query with IN clause containing subquery
                 Key: DRILL-2069
                 URL: https://issues.apache.org/jira/browse/DRILL-2069
             Project: Apache Drill
          Issue Type: Bug
          Components: Query Planning & Optimization
    Affects Versions: 0.8.0
            Reporter: Victoria Markman
            Assignee: Jinfeng Ni


t1.json
{code}
{ "a1": "aa", "b1": 1 }
{ "a1": "bb", "b1": 2 }
{ "a1": "cc", "b1": 3 }
{code}

t2.json
{code}
{ "a2": "aa", "b2": 1 }
{ "a2": "bb", "b2": 2 }
{ "a2": "xx", "b2": 10 }
{code}

Star is expanded incorrectly: the result should contain only the columns from `t1.json` (a1 and b1), but it also includes a2 from the subquery and a duplicate a10 column.
{code}
0: jdbc:drill:schema=dfs> select * from `t1.json` where a1 in (select a2 from `t2.json`);
+------------+------------+------------+------------+
|     a2     |     a1     |     b1     |    a10     |
+------------+------------+------------+------------+
| aa         | aa         | 1          | aa         |
| bb         | bb         | 2          | bb         |
+------------+------------+------------+------------+
2 rows selected (0.172 seconds)
{code}
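
For comparison, here is a minimal sketch of the expected result, assuming the same rows are loaded into an in-memory SQLite database instead of being read by Drill. The table names t1 and t2 and the SQLite setup are stand-ins for illustration only and are not part of the original report. Under standard SQL semantics the star expands to the columns of the outer table only.

```python
import sqlite3

# Assumption: SQLite tables stand in for t1.json and t2.json from the report.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (a1 TEXT, b1 INTEGER)")
conn.execute("CREATE TABLE t2 (a2 TEXT, b2 INTEGER)")
conn.executemany("INSERT INTO t1 VALUES (?, ?)", [("aa", 1), ("bb", 2), ("cc", 3)])
conn.executemany("INSERT INTO t2 VALUES (?, ?)", [("aa", 1), ("bb", 2), ("xx", 10)])

# Same query shape as in the report; ORDER BY only makes the row order deterministic.
cur = conn.execute("SELECT * FROM t1 WHERE a1 IN (SELECT a2 FROM t2) ORDER BY a1")
expected_columns = [d[0] for d in cur.description]
expected_rows = cur.fetchall()

print(expected_columns)  # ['a1', 'b1']
print(expected_rows)     # [('aa', 1), ('bb', 2)]
```

Only a1 and b1 come back here, which is the column set the Drill query should also return.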

Explain plan:
{code}
00-01      Project(*=[$0])
00-02        Project(*=[$0])
00-03          HashJoin(condition=[=($1, $2)], joinType=[inner])
00-05            Project(*=[$0], a1=[$1])
00-07              Scan(groupscan=[EasyGroupScan [selectionRoot=/test/t1.json, numFiles=1, columns=[`*`], files=[maprfs:/test/t1.json]]])
00-04            HashAgg(group=[{0}])
00-06              Scan(groupscan=[EasyGroupScan [selectionRoot=/test/t2.json, numFiles=1, columns=[`a2`], files=[maprfs:/test/t2.json]]])
{code}
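
The plan evaluates the IN clause as an inner HashJoin of `t1.json` against a HashAgg over a2. The sketch below reproduces that join shape in SQLite to show why the projection above the join matters. It is only an illustration under assumed table names (t1, t2, and the alias s are hypothetical), and it does not claim to model Drill's planner or the actual cause of the bug.

```python
import sqlite3

# Assumption: SQLite tables stand in for t1.json and t2.json from the report.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (a1 TEXT, b1 INTEGER)")
conn.execute("CREATE TABLE t2 (a2 TEXT, b2 INTEGER)")
conn.executemany("INSERT INTO t1 VALUES (?, ?)", [("aa", 1), ("bb", 2), ("cc", 3)])
conn.executemany("INSERT INTO t2 VALUES (?, ?)", [("aa", 1), ("bb", 2), ("xx", 10)])

# IN (subquery) written as an inner join against the de-duplicated subquery,
# mirroring the HashJoin over HashAgg(group=[a2]) shape in the plan.
join_body = "FROM t1 JOIN (SELECT a2 FROM t2 GROUP BY a2) s ON t1.a1 = s.a2 ORDER BY t1.a1"

# Star restricted to the outer table: the subquery column is not exposed.
cur = conn.execute("SELECT t1.* " + join_body)
restricted_columns = [d[0] for d in cur.description]
restricted_rows = cur.fetchall()

# Star over the whole join: the subquery column a2 leaks into the output.
cur = conn.execute("SELECT * " + join_body)
leaky_columns = [d[0] for d in cur.description]

print(restricted_columns)  # ['a1', 'b1']
print(restricted_rows)     # [('aa', 1), ('bb', 2)]
print(leaky_columns)       # ['a1', 'b1', 'a2']
```

If the star is expanded against the join output rather than against the outer table alone, the subquery column appears in the result, which resembles the extra a2 column in the output reported above.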

Workaround: specify the columns explicitly (this example selects t1.a1 twice, so the second copy is displayed as a10)
{code}
0: jdbc:drill:schema=dfs> select t1.a1, t1.a1 from `t1.json` t1 where t1.a1 in (select t2.a2 from `t2.json` t2);
+------------+------------+
|     a1     |    a10     |
+------------+------------+
| aa         | aa         |
| bb         | bb         |
+------------+------------+
2 rows selected (0.24 seconds)
{code}

