carbondata-issues mailing list archives

From ravipesala <...@git.apache.org>
Subject [GitHub] incubator-carbondata issue #620: [CARBONDATA-742]Added batch sort to improve...
Date Wed, 08 Mar 2017 16:59:12 GMT
Github user ravipesala commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/620
  
    Data size -> 100 million records
    **DDL and Queries for test**
    CREATE TABLE perftesta (c1 string,c2 string,c3 string,c4 string,c5 string,c6 bigint,c7 double,c8 int,c9 double,c10 double) STORED BY 'carbondata';
    
    Q1 -> select count(*) from perftesta;
    Q2 -> SELECT c3, c4, sum(c8) FROM perftesta WHERE c1 = 'P1_24521' GROUP BY c3, c4;
    Q3 -> SELECT c2, c5, count(distinct c1), sum(c7) FROM perftesta WHERE c4="P4_4" and c5="P5_7" and c8>4 GROUP BY c2, c5;
    Q4 -> SELECT c2, c5, count(distinct c1), sum(c7) FROM perftesta WHERE c4="P4_4" and c5="P5_7" GROUP BY c2, c5;
    Q5 -> SELECT c4 FROM perftesta WHERE c1="P1_24521";
    Q6 -> SELECT * FROM perftesta WHERE c2="P2_43";
    Q7 -> SELECT sum(c7), sum(c8), avg(c9), max(c10) FROM perftesta;
    Q8 -> SELECT sum(c7), sum(c8), sum(9), sum(c10) FROM perftesta WHERE c2="P2_75" and c6<5;
    Q9 -> SELECT sum(c7), sum(c8), sum(9), sum(c10) FROM perftesta WHERE c2="P2_75";
    Q10 -> SELECT count(c1),count(c2),count(c3),count(c4),count(c5),count(c6),count(c7),count(c8),count(c9),count(c10) FROM perftesta;
    
    **With Batch Sort**
    Load with in-memory size 1GB (with unsafe sort), so batch size will be ~450MB --> Time: 324 seconds
    Total blocks created: 14 files of 105MB each
    
    Query(first reading, second reading)
    Q1 (6.577, 3.404)
    Q2 (3.414, 1.639)
    Q3 (8.552, 7.572)
    Q4 (5.033, 3.875)
    Q5 (0.616, 0.456)
    Q6 (7.978, 7.682)
    Q7 (3.985, 2.909)
    Q8 (8.93, 8.697)
    Q9 (3.606, 3.305)
    Q10 (8.51, 8.367)
    
    **With complete sort (old flow)**
    Load with in-memory size 1GB (with unsafe sort) --> Time: 430 seconds
    Total blocks created: 2 files, of 920MB and 560MB
    
    Query(first reading, second reading)
    Q1 (7.473, 2.254)
    Q2 (2.635, 0.678)
    Q3 (11.411, 9.322)
    Q4 (4.422, 3.883)
    Q5 (0.332, 0.22)
    Q6 (8.580, 8.187)
    Q7 (4.364, 3.617)
    Q8 (12.033, 12.138)
    Q9 (3.622, 3.695)
    Q10 (8.39, 8.941)
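
To make the two runs easier to compare, here is a minimal Python sketch that works out the load-time reduction and the per-query ratio of second readings. The numbers are copied from the results above; the helper names are illustrative only, and the timings are assumed to be in seconds since the unit is not stated for the query readings.

```python
# Timings copied from the two runs above: (first reading, second reading).
# Assumption: query readings are in seconds (the unit is not stated above).
batch_sort = {
    "Q1": (6.577, 3.404), "Q2": (3.414, 1.639), "Q3": (8.552, 7.572),
    "Q4": (5.033, 3.875), "Q5": (0.616, 0.456), "Q6": (7.978, 7.682),
    "Q7": (3.985, 2.909), "Q8": (8.93, 8.697), "Q9": (3.606, 3.305),
    "Q10": (8.51, 8.367),
}
complete_sort = {
    "Q1": (7.473, 2.254), "Q2": (2.635, 0.678), "Q3": (11.411, 9.322),
    "Q4": (4.422, 3.883), "Q5": (0.332, 0.22), "Q6": (8.580, 8.187),
    "Q7": (4.364, 3.617), "Q8": (12.033, 12.138), "Q9": (3.622, 3.695),
    "Q10": (8.39, 8.941),
}

LOAD_BATCH_S = 324     # load time with batch sort
LOAD_COMPLETE_S = 430  # load time with complete sort (old flow)


def load_time_reduction(old_s, new_s):
    """Fraction of load time saved by the new flow relative to the old one."""
    return (old_s - new_s) / old_s


def second_reading_ratio(batch, complete):
    """Per query: batch-sort time divided by complete-sort time.

    A value below 1.0 means batch sort was faster on the second reading.
    """
    return {q: batch[q][1] / complete[q][1] for q in batch}


reduction = load_time_reduction(LOAD_COMPLETE_S, LOAD_BATCH_S)
print(f"Load time reduction with batch sort: {reduction:.1%}")

for query, ratio in second_reading_ratio(batch_sort, complete_sort).items():
    faster = "batch sort" if ratio < 1.0 else "complete sort"
    print(f"{query}: ratio {ratio:.2f} ({faster} faster on second reading)")
```

A ratio is used rather than an absolute difference so that short queries such as Q5 and long ones such as Q8 can be read on the same scale.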


