drill-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-5851) Empty table during a join operation with a non empty table produces cast exception
Date Sun, 24 Dec 2017 05:42:00 GMT

    [ https://issues.apache.org/jira/browse/DRILL-5851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16302680#comment-16302680 ]

ASF GitHub Bot commented on DRILL-5851:
---------------------------------------

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1059#discussion_r158593785
  
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/join/TestHashJoinAdvanced.java ---
    @@ -160,4 +166,51 @@ public void testJoinWithMapAndDotField() throws Exception {
           .baselineValues("1", "2", "1", null, "a")
           .go();
       }
    +
    +  private void buildFile(String fileName, String[] data, File testDir) throws IOException {
    +    try(PrintWriter out = new PrintWriter(new FileWriter(new File(testDir, fileName)))) {
    +      for (String line : data) {
    +        out.println(line);
    +      }
    +    }
    +  }
    +
    +  @Test
    +  public void testHashLeftJoinWithEmptyTable() throws Exception {
    +    try (ClusterFixture cluster = ClusterFixture.builder(dirTestWatcher).build();
    +      ClientFixture client = cluster.clientFixture()) {
    +      File testDir = dirTestWatcher.getRootDir();
    +      buildFile("dept.json", new String[0], testDir);
    +      QueryBuilder query = client.queryBuilder().sql("select * from cp.`employee.json` emp left outer join dfs.`dept.json` as dept on dept.manager = emp.`last_name`");
    +      assert(query.run().recordCount() == 1155);
    +    } catch (RuntimeException ex) {
    +      fail(ex.getMessage());
    +    }
    +  }
    +
    +  @Test
    +  public void testHashInnerJoinWithEmptyTable() throws Exception {
    +    try (ClusterFixture cluster = ClusterFixture.builder(dirTestWatcher).build();
    +      ClientFixture client = cluster.clientFixture()) {
    +      File testDir = dirTestWatcher.getRootDir();
    +      buildFile("dept.json", new String[0], testDir);
    +      QueryBuilder query = client.queryBuilder().sql("select * from cp.`employee.json` emp inner join dfs.`dept.json` as dept on dept.manager = emp.`last_name`");
    +      assert(query.run().recordCount() == 0);
    +    } catch (RuntimeException ex) {
    +      fail(ex.getMessage());
    +    }
    +  }
    +
    +  @Test
    +  public void testHashRightJoinWithEmptyTable() throws Exception {
    +    try (ClusterFixture cluster = ClusterFixture.builder(dirTestWatcher).build();
    --- End diff --
    
    All three tests use the same server setup, yet the current code starts the server three times. That would be fine if each test needed a different config. Here, however, it would be faster to derive from `ClusterTest` and use the methods it provides to start the server once, sharing the same server across all tests.
    
    Upon further inspection, there is a deeper problem: two servers are running. See note below.
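
To illustrate the cost difference between per-test and shared server startup, here is a minimal, self-contained sketch. It does not use Drill's test framework: `FakeServer`, `startsWithPerTestServer`, and `startsWithSharedServer` are hypothetical names invented for this illustration, and the stand-in server only counts how often it is constructed.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SharedServerSketch {

  /** Hypothetical stand-in for an embedded server; it only counts its own startups. */
  static final class FakeServer implements AutoCloseable {
    static final AtomicInteger STARTS = new AtomicInteger();

    FakeServer() {
      STARTS.incrementAndGet(); // a real server would do expensive startup work here
    }

    void run(String sql) {
      // A real server would execute the query; the sketch ignores it.
    }

    @Override
    public void close() {
      // A real server would release its resources here.
    }
  }

  /** Mirrors the pattern in the diff: every test opens and closes its own server. */
  static int startsWithPerTestServer(String[] queries) {
    FakeServer.STARTS.set(0);
    for (String q : queries) {
      try (FakeServer server = new FakeServer()) {
        server.run(q);
      }
    }
    return FakeServer.STARTS.get();
  }

  /** Mirrors the suggested pattern: one server is started once and shared by all tests. */
  static int startsWithSharedServer(String[] queries) {
    FakeServer.STARTS.set(0);
    try (FakeServer server = new FakeServer()) {
      for (String q : queries) {
        server.run(q);
      }
    }
    return FakeServer.STARTS.get();
  }

  public static void main(String[] args) {
    String[] queries = {"left outer join", "inner join", "right outer join"};
    System.out.println("per-test starts: " + startsWithPerTestServer(queries));
    System.out.println("shared starts: " + startsWithSharedServer(queries));
  }
}
```

With three tests, the per-test variant constructs the server three times and the shared variant constructs it once. In a JUnit test class, the shared variant would typically correspond to class-level setup rather than a try-with-resources block inside each test method.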


> Empty table during a join operation with a non empty table produces cast exception 
> -----------------------------------------------------------------------------------
>
>                 Key: DRILL-5851
>                 URL: https://issues.apache.org/jira/browse/DRILL-5851
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Relational Operators
>    Affects Versions: 1.11.0
>            Reporter: Hanumath Rao Maduri
>            Assignee: Hanumath Rao Maduri
>
> A hash join between an empty table and a non-empty table throws an exception:
> {code} 
> Error: SYSTEM ERROR: DrillRuntimeException: Join only supports implicit casts between 1. Numeric data
>  2. Varchar, Varbinary data 3. Date, Timestamp data Left type: VARCHAR, Right type: INT. Add explicit casts to avoid this error
> {code}
> Here is an example query with which it is reproducible.
> {code}
> select * from cp.`sample-data/nation.parquet` nation left outer join dfs.tmp.`2.csv` as two on two.a = nation.`N_COMMENT`;
> {code}
> The file 2.csv is empty (i.e., it contains no data, not even header info).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
