Chun Chang (JIRA)
[jira] [Commented] (DRILL-2082) nested arrays of strings returned wrong results
Date Sat, 14 Feb 2015 01:29:11 GMT

    https://issues.apache.org/jira/browse/DRILL-2082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14321112#comment-14321112

Chun Chang commented on DRILL-2082:

#Fri Feb 13 15:14:59 EST 2015

0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select t.id, t.aaa from `complex.json` t limit 5;
|     id     |    aaa     |
| 1          | [[["aa0 1"],["ab0 1"]],[["ba0 1"],["bb0 1"]],[["ca0 1","ca1 1"],["cb0 1","cb1 1","cb2 1"]]] |
| 2          | [[["aa0 2"],["ab0 2"]],[["ba0 2"],["bb0 2"]],[["ca0 2","ca1 2"],["cb0 2","cb1 2","cb2 2"]]] |
| 3          | [[["aa0 3"],["ab0 3"]],[["ba0 3"],["bb0 3"]],[["ca0 3","ca1 3"],["cb0 3","cb1 3","cb2 3"]]] |
| 4          | [[["aa0 4"],["ab0 4"]],[["ba0 4"],["bb0 4"]],[["ca0 4","ca1 4"],["cb0 4","cb1 4","cb2 4"]]] |
| 5          | [[["aa0 5"],["ab0 5"]],[["ba0 5"],["bb0 5"]],[["ca0 5","ca1 5"],["cb0 5","cb1 5","cb2 5"]]] |
5 rows selected (0.273 seconds)
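
Output like the above can be checked mechanically rather than by eye. The following is a minimal sketch, assuming (as the sample rows suggest, but the issue does not state) that every string inside `aaa` ends with the row's `id`. The helper names `flatten` and `row_is_consistent` are hypothetical and not part of Drill.

```python
import json


def flatten(value):
    """Yield every leaf of an arbitrarily nested list."""
    if isinstance(value, list):
        for item in value:
            yield from flatten(item)
    else:
        yield value


def row_is_consistent(row_id, aaa):
    """Return True if every string leaf in aaa ends with ' <row_id>'.

    Assumption: the test data was generated so that each string carries
    the id of the row it belongs to (e.g. "aa0 1" for id 1).
    """
    suffix = " " + str(row_id)
    return all(isinstance(s, str) and s.endswith(suffix) for s in flatten(aaa))


# A correct value for id 1, copied from the result set.
good = json.loads(
    '[[["aa0 1"],["ab0 1"]],[["ba0 1"],["bb0 1"]],'
    '[["ca0 1","ca1 1"],["cb0 1","cb1 1","cb2 1"]]]'
)
# A shortened excerpt of the wrong value reported for id 1.
bad = json.loads('[[["ba0 56"],["bb0 56"]],[[""],["bb0 3741"]]]')

print(row_is_consistent(1, good))
print(row_is_consistent(1, bad))
```

Applied to each returned row, a check of this kind flags any row whose array contains strings belonging to other ids, which is the symptom described in the issue.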

> nested arrays of strings returned wrong results
> -----------------------------------------------
>                 Key: DRILL-2082
>                 URL: https://issues.apache.org/jira/browse/DRILL-2082
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Data Types
>    Affects Versions: 0.8.0
>            Reporter: Chun Chang
>            Assignee: Mehant Baid
>            Priority: Critical
>             Fix For: 0.8.0
> #Mon Jan 26 14:10:51 PST 2015
> git.commit.id.abbrev=3c6d0ef
> Querying a nested array of strings (a complex JSON data type) returned wrong results when the data size is large (1 million rows). A smaller data size (a few rows) returned correct results. Test data can be accessed at http://apache-drill.s3.amazonaws.com/files/complex.json.gz
> For a small data size, I got correct results:
> {code}
> 0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select t.id, t.aaa from `aaa.json`
> +------------+------------+
> |     id     |    aaa     |
> +------------+------------+
> | 1          | [[["aa0 1"],["ab0 1"]],[["ba0 1"],["bb0 1"]],[["ca0 1","ca1 1"],["cb0 1","cb1 1","cb2 1"]]] |
> | 2          | [[["aa0 2"],["ab0 2"]],[["ba0 2"],["bb0 2"]],[["ca0 2","ca1 2"],["cb0 2","cb1 2","cb2 2"]]] |
> +------------+------------+
> {code}
> But a large data size returned wrong results:
> {code}
> 0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select t.id, t.aaa from `complex.json` t where t.id=1 limit 1;
> +------------+------------+
> |     id     |    aaa     |
> +------------+------------+
> | 1          | [[["ba0 56"],["bb0 56"],["ca0 56","ca1 56"],["cb0 56","cb1 56","cb2 56"],["aa0
91"],["ab0 91"],["aa0 125"],["ab0 125"],["aa0 140"],["ab0 140"],["aa0 142"],["ab0 142"],["aa0
146"],["ab0 146"],["ba0 402"],["bb0 402"],["ca0 402","ca1 402"],["cb0 402","cb1 402","cb2
402"],["aa0 403"],["ab0 403"],["ba0 403"],["bb0 403"],["ca0 403","ca1 403"],["cb0 403","cb1
403","cb2 403"],["aa0 404"],["ab0 404"],["ba0 404"],["bb0 404"],["ca0 404","ca1 404"],["cb0
404","cb1 404","cb2 404"],["aa0 405"],["ab0 405"],["ba0 405"],["bb0 405"],["ca0 405","ca1
405"],["cb0 405","cb1 405","cb2 405"],["aa0 437"],["ab0 437"],["aa0 485"],["ab0 485"],["aa0
503"],["ab0 503"],["aa0 569"],["ab0 569"],["aa0 581"],["ab0 581"],["aa0 620"],["ab0 620"],["aa0
632"],["ab0 632"],["aa0 640"],["ab0 640"],["aa0 650"],["ab0 650"],["aa0 669"],["ab0 669"],["aa0
671"],["ab0 671"],["aa0 728"],["ab0 728"],["aa0 735"],["ab0 735"],["aa0 772"],["ab0 772"],["aa0
784"],["ab0 784"],["aa0 811"],["ab0 811"],["aa0 817"],["ab0 817"],["aa0 836"],["ab0 836"],["aa0
881"],["ab0 881"],["aa0 891"],["ab0 891"],["aa0 924"],["ab0 924"],["aa0 1005"],["ab0 1005"],["aa0
1057"],["ab0 1057"],["aa0 1086"],["ab0 1086"],["aa0 1089"],["ab0 1089"],["aa0 1097"],["ab0
1097"],["aa0 1133"],["ab0 1133"],["aa0 1136"],["ab0 1136"],["aa0 1146"],["ab0 1146"],["aa0
1169"],["ab0 1169"],["aa0 1178"],["ab0 1178"],["aa0 1184"],["ab0 1184"],["aa0 1189"],["ab0
1189"],["aa0 1223"],["ab0 1223"],["aa0 1275"],["ab0 1275"],["aa0 1290"],["ab0 1290"],["aa0
1295"],["ab0 1295"],["aa0 1320"],["ab0 1320"],["aa0 1343"],["ab0 1343"],["aa0 1400"],["ab0
1400"],["aa0 1426"],["ab0 1426"],["aa0 1442"],["ab0 1442"],["aa0 1455"],["ab0 1455"],["aa0
1499"],["ab0 1499"],["aa0 1521"],["ab0 1521"],["aa0 1541"],["ab0 1541"],["aa0 1557"],["ab0
1557"],["aa0 1578"],["ab0 1578"],["aa0 1633"],["ab0 1633"],["aa0 1635"],["ab0 1635"],["aa0
1651"],["ab0 1651"],["aa0 1665"],["ab0 1665"],["aa0 1689"],["ab0 1689"],["aa0 1760"],["ab0
1760"],["aa0 1784"],["ab0 1784"],["aa0 1796"],["ab0 1796"],["aa0 1801"],["ab0 1801"],["aa0
1817"],["ab0 1817"],["aa0 1861"],["ab0 1861"],["aa0 1872"],["ab0 1872"],["aa0 1895"],["ab0
1895"],["aa0 1897"],["ab0 1897"],["aa0 1911"],["ab0 1911"],["aa0 1975"],["ab0 1975"],["aa0
1983"],["ab0 1983"],["aa0 1996"],["ab0 1996"],["aa0 2005"],["ab0 2005"],["aa0 2048"],["ab0
2048"],["aa0 2063"],["ab0 2063"],["aa0 2150"],["ab0 2150"],["aa0 2159"],["ab0 2159"],["aa0
2214"],["ab0 2214"],["aa0 2218"],["ab0 2218"],["aa0 2220"],["ab0 2220"],["aa0 2250"],["ab0
2250"],["aa0 2256"],["ab0 2256"],["aa0 2265"],["ab0 2265"],["aa0 2296"],["ab0 2296"],["aa0
2319"],["ab0 2319"],["aa0 2327"],["ab0 2327"],["aa0 2333"],["ab0 2333"],["aa0 2361"],["ab0
2361"],["aa0 2392"],["ab0 2392"],["aa0 2399"],["ab0 2399"],["aa0 2424"],["ab0 2424"],["aa0
2466"],["ab0 2466"],["aa0 2473"],["ab0 2473"],["aa0 2508"],["ab0 2508"],["aa0 2524"],["ab0
2524"],["aa0 2550"],["ab0 2550"],["aa0 2553"],["ab0 2553"],["aa0 2560"],["ab0 2560"],["aa0
2563"],["ab0 2563"],["aa0 2574"],["ab0 2574"],["aa0 2592"],["ab0 2592"],["aa0 2600"],["ab0
2600"],["aa0 2606"],["ab0 2606"],["aa0 2639"],["ab0 2639"],["aa0 2670"],["ab0 2670"],["aa0
2684"],["ab0 2684"],["aa0 2720"],["ab0 2720"],["aa0 2745"],["ab0 2745"],["aa0 2763"],["ab0
2763"],["aa0 2786"],["ab0 2786"],["aa0 2831"],["ab0 2831"],["aa0 2834"],["ab0 2834"],["aa0
2838"],["ab0 2838"],["aa0 2842"],["ab0 2842"],["aa0 2909"],["ab0 2909"],["aa0 2982"],["ab0
2982"],["aa0 2989"],["ab0 2989"],["aa0 2992"],["ab0 2992"],["aa0 3027"],["ab0 3027"],["aa0
3033"],["ab0 3033"],["aa0 3052"],["ab0 3052"],["aa0 3072"],["ab0 3072"],["aa0 3078"],["ab0
3078"],["aa0 3104"],["ab0 3104"],["aa0 3116"],["ab0 3116"],["aa0 3152"],["ab0 3152"],["aa0
3168"],["ab0 3168"],["aa0 3195"],["ab0 3195"],["aa0 3202"],["ab0 3202"],["aa0 3212"],["ab0
3212"],["aa0 3227"],["ab0 3227"],["aa0 3252"],["ab0 3252"],["aa0 3258"],["ab0 3258"],["aa0
3269"],["ab0 3269"],["aa0 3308"],["ab0 3308"],["aa0 3332"],["ab0 3332"],["aa0 3351"],["ab0
3351"],["aa0 3359"],["ab0 3359"],["aa0 3382"],["ab0 3382"],["aa0 3400"],["ab0 3400"],["aa0
3450"],["ab0 3450"],["aa0 3455"],["ab0 3455"],["aa0 3478"],["ab0 3478"],["aa0 3484"],["ab0
3484"],["aa0 3504"],["ab0 3504"],["aa0 3531"],["ab0 3531"],["aa0 3557"],["ab0 3557"],["aa0
3582"],["ab0 3582"],["aa0 3631"],["ab0 3631"],["aa0 3658"],["ab0 3658"],["aa0 3703"],["ab0
3703"],["aa0 3710"],["ab0 3710"],["aa0 3716"],["ab0 3716"],["aa0 3741"],["ab0 3741"],["aa0
3759"],["ab0 3759"],["aa0 3803"],["ab0 3803"],["aa0 3852"],["ab0 3852"],["aa0 3874"],["ab0
3874"],["aa0 3884"],["ab0 3884"],["aa0 3887"],["ab0 3887"],["aa0 3889"],["ab0 3889"],["aa0
3981"],["ab0 3981"],["aa0 3993"],["ab0 3993"],["aa0 4012"],["ab0 4012"],["aa0 4024"],["ab0
4024"],["aa0 4032"],["ab0 4032"],["aa0 4042"],["ab0 4042"],["aa0 4066"],["ab0 4066"],["aa0
4088"],["ab0 4088"],["aa0 4095"],["ab0 4095"]],[[""],["bb0 3741"],["ba0 3759"],["bb0 3759"],["ba0
3803"],["bb0 3803"],["ba0 3814"],["bb0 3814"],["ba0 3852"],["bb0 3852"],["ba0 3874"],["bb0
3874"],["ba0 3884"],["bb0 3884"],["ba0 3887"],["bb0 3887"],["ba0 3889"],["bb0 3889"],["ba0
3957"],["bb0 3957"],["ba0 3981"],["bb0 3981"],["ba0 3993"],["bb0 3993"],["ba0 4012"],["bb0
4012"],["ba0 4024"],["bb0 4024"],["ba0 4032"],["bb0 4032"],["ba0 4042"],["bb0 4042"],["ba0
4066"],["bb0 4066"],["ba0 4088"],["bb0 4088"],["ba0 4095"],["bb0 4095"]],[["ca0 4095","ca1
4095"],["cb0 4095","cb1 4095","cb2 4095"]]] |
> +------------+------------+
> {code}
> Physical plan:
> {code}
> 0: jdbc:drill:schema=dfs.drillTestDirComplexJ> explain plan for select t.id, t.aaa from `complex.json` t where t.id=1 limit 1;
> +------------+------------+
> |    text    |    json    |
> +------------+------------+
> | 00-00    Screen
> 00-01      Project(id=[$0], aaa=[$1])
> 00-02        SelectionVectorRemover
> 00-03          Limit(fetch=[1])
> 00-04            Filter(condition=[=($0, 1)])
> 00-05              Project(id=[$1], aaa=[$0])
> 00-06                Scan(groupscan=[EasyGroupScan [selectionRoot=/drill/testdata/complex_type/json/complex.json, numFiles=1, columns=[`id`, `aaa`], files=[maprfs:/drill/testdata/complex_type/json/complex.json]]])
> {code}

