drill-issues mailing list archives

From "Aman Sinha (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-1672) hbase string comparison handled wrong
Date Tue, 11 Nov 2014 01:48:34 GMT

    [ https://issues.apache.org/jira/browse/DRILL-1672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14205785#comment-14205785 ]

Aman Sinha commented on DRILL-1672:
-----------------------------------

I believe that the fix for DRILL-1631 should fix this issue.  Although I haven't checked against
HBase, the issue also manifests with other data sources.  To illustrate, here is an extract from
the Explain plan before and after the DRILL-1631 fix (note that no CAST is added in the second case):

{code:sql} 
explain plan for select count(*) from cp.`tpch/nation.parquet` where n_name = 'IR';
Filter node (Before): 
      Filter(condition=[=(CAST($0):CHAR(2) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary", 'IR')])
Filter node (After):
      Filter(condition=[=($0, 'IR')])
{code}
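
The rows returned in the quoted report below (10, 100, 1000, 101, ...) are consistent with the CAST to CHAR(2) truncating each value to the length of the literal before the comparison. The following is only a minimal Python sketch of that effect, not Drill code: it assumes the cast behaves as plain truncation, and the helper names (`cast_to_char`, `filter_before`, `filter_after`) and the sample keys "11" and "20" are hypothetical.

```python
def cast_to_char(value: str, length: int) -> str:
    """Model CAST(value AS CHAR(length)) as truncation (an assumption)."""
    return value[:length]


def filter_before(keys, literal):
    """Filter as planned before the fix: column is cast to CHAR(len(literal))."""
    width = len(literal)
    return [k for k in keys if cast_to_char(k, width) == literal]


def filter_after(keys, literal):
    """Filter as planned after the fix: column is compared directly."""
    return [k for k in keys if k == literal]


# A few row keys from the report, plus "11" and "20" added for contrast.
row_keys = ["10", "100", "1000", "101", "102", "11", "20"]

before = filter_before(row_keys, "10")
after = filter_after(row_keys, "10")

print(before)  # every key that starts with "10"
print(after)   # only the exact match
```

Under this model, every key beginning with "10" passes the pre-fix filter, while the direct comparison keeps only the exact match.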

> hbase string comparison handled wrong
> -------------------------------------
>
>                 Key: DRILL-1672
>                 URL: https://issues.apache.org/jira/browse/DRILL-1672
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - HBase
>    Affects Versions: 0.7.0
>            Reporter: Chun Chang
>            Priority: Blocker
>
> #Tue Nov 04 16:58:08 UTC 2014
> git.commit.id.abbrev=129cb9c
> String is not compared properly:
> 0: jdbc:drill:schema=hbase> select cast(s.row_key as varchar(20)), cast(s.onecf.name as varchar(30)) name from student s where s.row_key = '10';
> +------------+------------+
> |   EXPR$0   |    name    |
> +------------+------------+
> | 10         | victor nixon |
> | 100        | bob van buren |
> | 1000       | ulysses young |
> | 101        | nick carson |
> | 102        | ethan ovid |
> | 103        | katie thompson |
> | 104        | luke polk  |
> | 105        | ulysses davidson |
> | 106        | bob ovid   |
> | 107        | katie robinson |
> | 108        | sarah laertes |
> | 109        | priscilla xylophone |
> +------------+------------+
> Here is the plan:
> 0: jdbc:drill:schema=hbase> explain plan for select cast(s.row_key as varchar(20)), cast(s.onecf.name as varchar(30)) name from student s where s.row_key = '10';
> +------------+------------+
> |    text    |    json    |
> +------------+------------+
> | 00-00    Screen
> 00-01      Project(EXPR$0=[CAST($0):VARCHAR(20) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary" NOT NULL], name=[CAST($1):VARCHAR(30) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"])
> 00-02        SelectionVectorRemover
> 00-03          Filter(condition=[=(CAST($0):CHAR(2) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary" NOT NULL, '10')])
> 00-04            Project(row_key=[$0], ITEM=[ITEM($1, 'name')])
> 00-05              Scan(groupscan=[HBaseGroupScan [HBaseScanSpec=HBaseScanSpec [tableName=student, startRow=null, stopRow=null, filter=null], columns=[SchemaPath [`row_key`], SchemaPath [`onecf`.`name`]]]])
>  | {
>   "head" : {
>     "version" : 1,
>     "generator" : {
>       "type" : "ExplainHandler",
>       "info" : ""
>     },
>     "type" : "APACHE_DRILL_PHYSICAL",
>     "options" : [ ],
>     "queue" : 0,
>     "resultMode" : "EXEC"
>   },
>   "graph" : [ {
>     "pop" : "hbase-scan",
>     "@id" : 5,
>     "hbaseScanSpec" : {
>       "tableName" : "student",
>       "startRow" : "",
>       "stopRow" : "",
>       "serializedFilter" : null
>     },
>     "storage" : {
>       "type" : "hbase",
>       "config" : {
>         "hbase.zookeeper.quorum" : "10.10.100.113,10.10.100.114,10.10.100.115",
>         "hbase.zookeeper.property.clientPort" : "5181"
>       },
>       "size.calculator.enabled" : false,
>       "enabled" : true
>     },
>     "columns" : [ "`row_key`", "`onecf`.`name`" ],
>     "cost" : 1048576.0
>   }, {
>     "pop" : "project",
>     "@id" : 4,
>     "exprs" : [ {
>       "ref" : "`row_key`",
>       "expr" : "`row_key`"
>     }, {
>       "ref" : "`ITEM`",
>       "expr" : "`onecf`.`name`"
>     } ],
>     "child" : 5,
>     "initialAllocation" : 1000000,
>     "maxAllocation" : 10000000000,
>     "cost" : 1048576.0
>   }, {
>     "pop" : "filter",
>     "@id" : 3,
>     "child" : 4,
>     "expr" : "equal(cast( (`row_key` ) as VARCHAR(2) ), '10') ",
>     "initialAllocation" : 1000000,
>     "maxAllocation" : 10000000000,
>     "cost" : 157286.4
>   }, {
>     "pop" : "selection-vector-remover",
>     "@id" : 2,
>     "child" : 3,
>     "initialAllocation" : 1000000,
>     "maxAllocation" : 10000000000,
>     "cost" : 157286.4
>   }, {
>     "pop" : "project",
>     "@id" : 1,
>     "exprs" : [ {
>       "ref" : "`EXPR$0`",
>       "expr" : "cast( (`row_key` ) as VARCHAR(20) )"
>     }, {
>       "ref" : "`name`",
>       "expr" : "cast( (`ITEM` ) as VARCHAR(30) )"
>     } ],
>     "child" : 2,
>     "initialAllocation" : 1000000,
>     "maxAllocation" : 10000000000,
>     "cost" : 157286.4
>   }, {
>     "pop" : "screen",
>     "@id" : 0,
>     "child" : 1,
>     "initialAllocation" : 1000000,
>     "maxAllocation" : 10000000000,
>     "cost" : 157286.4
>   } ]
> } |
> +------------+------------+
> The filter item is cast to CHAR(2):
> Filter(condition=[=(CAST($0):CHAR(2)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
