phoenix-dev mailing list archives

From "Mujtaba Chohan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-2943) Performance of parallel order by query is > 30X slower than serial execution
Date Tue, 31 May 2016 19:00:14 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-2943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15308359#comment-15308359 ]

Mujtaba Chohan commented on PHOENIX-2943:
-----------------------------------------

[~jamestaylor] This is not a regression: query results are incorrect in 4.7.0 as well (see PHOENIX-2942).

[~samarthjain] [~ankit@apache.org]
* Performance of the serial query is significantly better on the current 4.8 snapshot head than on 4.7.0. As Ankit mentions, it seems to be getting those values from the first chunk.
* Results are incorrect for *both* parallel and serial execution.

{code}
select  L_DISCOUNT, L_QUANTITY from lineitem_encoded order by (l_discount,L_QUANTITY) DESC limit 1;
+-------------+-------------+
| L_DISCOUNT  | L_QUANTITY  |
+-------------+-------------+
| 0.04        | 17          |
+-------------+-------------+

select  L_DISCOUNT, L_QUANTITY from lineitem_encoded order by (l_discount,L_QUANTITY) ASC limit 1;
+-------------+-------------+
| L_DISCOUNT  | L_QUANTITY  |
+-------------+-------------+
| 0.04        | 17          |
+-------------+-------------+

select  L_DISCOUNT, L_QUANTITY from lineitem_encoded order by (l_discount) ASC limit 1;
+-------------+-------------+
| L_DISCOUNT  | L_QUANTITY  |
+-------------+-------------+
| 0           | 38          |
+-------------+-------------+

select  L_DISCOUNT, L_QUANTITY from lineitem_encoded order by (l_discount) DESC limit 1;
+-------------+-------------+
| L_DISCOUNT  | L_QUANTITY  |
+-------------+-------------+
| 0.1         | 8           |
+-------------+-------------+
{code}

Without stats, the serial query is almost as slow as parallel execution over 2 regions. Results
are incorrect both with and without stats. Tested the above with both HBase 0.98.12 and 0.98.17.
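
For reference, here is a minimal sketch of what the SERVER TOP 1 ROW and CLIENT MERGE SORT steps in the plan are expected to compute. It is plain Python rather than Phoenix code. The rows are made up (not taken from lineitem_encoded), the function names are hypothetical, and it assumes that ordering by (l_discount, L_QUANTITY) compares rows column by column, which Python tuple comparison mimics.

```python
import heapq
from itertools import islice


def server_top_n(chunk, n, descending):
    """Per-chunk step: return this chunk's top n rows, already sorted."""
    return sorted(chunk, reverse=descending)[:n]


def client_merge_top_n(chunks, n=1, descending=False):
    """Client step: merge the sorted per-chunk results and keep the first n."""
    partials = [server_top_n(chunk, n, descending) for chunk in chunks]
    # heapq.merge expects each input sorted in the same direction as `reverse`.
    merged = heapq.merge(*partials, reverse=descending)
    return list(islice(merged, n))


# Made-up (l_discount, l_quantity) rows split across two hypothetical chunks.
chunks = [
    [(0.02, 5), (0.07, 30)],
    [(0.00, 12), (0.10, 3)],
]

asc_row = client_merge_top_n(chunks, n=1, descending=False)
desc_row = client_merge_top_n(chunks, n=1, descending=True)
print(asc_row, desc_row)
```

On this toy data the ASC and DESC calls return different rows, which is the property the first two queries above violate.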



> Performance of parallel order by query is > 30X slower than serial execution
> ----------------------------------------------------------------------------
>
>                 Key: PHOENIX-2943
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2943
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Mujtaba Chohan
>             Fix For: 4.8.0
>
>
> {code}
> select /*+SERIAL*/  L_DISCOUNT, L_QUANTITY from lineitem_encoded order by (l_discount,L_QUANTITY) limit 1;
> +-------------+-------------+
> | L_DISCOUNT  | L_QUANTITY  |
> +-------------+-------------+
> | 0.04        | 17          |
> +-------------+-------------+
> 1 row selected (0.129 seconds)
> select L_DISCOUNT, L_QUANTITY from lineitem_encoded order by (l_discount,L_QUANTITY) limit 1;
> +-------------+-------------+
> | L_DISCOUNT  | L_QUANTITY  |
> +-------------+-------------+
> | 0.04        | 17          |
> +-------------+-------------+
> 1 row selected (4.63 seconds)
> explain select /*+SERIAL*/  L_DISCOUNT, L_QUANTITY from lineitem_encoded order by (l_discount,L_QUANTITY) limit 1;
> +------------------------------------------------------------------------------------------------+
> |                                              PLAN                                              |
> +------------------------------------------------------------------------------------------------+
> | CLIENT 417-CHUNK 5978838 ROWS 4150009682 BYTES SERIAL 417-WAY FULL SCAN OVER LINEITEM_ENCODED  |
> |     SERVER TOP 1 ROW SORTED BY [(L_DISCOUNT, L_QUANTITY)]                                      |
> | CLIENT MERGE SORT                                                                              |
> +------------------------------------------------------------------------------------------------+
> 3 rows selected (0.016 seconds)
> 0: jdbc:phoenix:localhost> explain select  L_DISCOUNT, L_QUANTITY from lineitem_encoded order by (l_discount,L_QUANTITY) limit 1;
> +--------------------------------------------------------------------------------------------------+
> |                                               PLAN                                               |
> +--------------------------------------------------------------------------------------------------+
> | CLIENT 417-CHUNK 5978838 ROWS 4150009682 BYTES PARALLEL 417-WAY FULL SCAN OVER LINEITEM_ENCODED  |
> |     SERVER TOP 1 ROW SORTED BY [(L_DISCOUNT, L_QUANTITY)]                                        |
> | CLIENT MERGE SORT                                                                                |
> +--------------------------------------------------------------------------------------------------+
> 3 rows selected (0.015 seconds)
> {code}
> Profiler information on this to be added soon. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
