Date: Thu, 26 Mar 2015 19:28:53 +0000 (UTC)
From: "James Taylor (JIRA)"
To: dev@phoenix.incubator.apache.org
Reply-To: dev@phoenix.apache.org
Subject: [jira] [Commented] (PHOENIX-1779) Parallelize fetching of next batch of records for scans corresponding to queries with no order by

[ 
https://issues.apache.org/jira/browse/PHOENIX-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382491#comment-14382491 ]

James Taylor commented on PHOENIX-1779:
---------------------------------------

Awesome, [~samarthjain]. Apart from the test failures, does it function correctly? It would be interesting to run our performance.py script to get a rough idea of the performance impact.

> Parallelize fetching of next batch of records for scans corresponding to queries with no order by
> --------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-1779
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1779
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Samarth Jain
>            Assignee: Samarth Jain
>         Attachments: wip.patch
>
>
> Today in Phoenix we parallelize only the first execution of scans; that is, we load only the first batch of records, up to the scan's cache size, in parallel. Loading of subsequent batches of records in scanners is essentially serial. This could be improved, especially for queries that do not need any kind of merge sort on the client, including those with no ORDER BY clause. This could also potentially improve the performance of UPSERT SELECT statements that load data from one table and insert it into another; one such use case is creating immutable indexes for tables that already have data. It could also potentially improve the performance of our MapReduce solution for bulk loading data by speeding up the loading/mapping phase.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
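
As a rough illustration of the idea in the issue description, the sketch below requests each scanner's next batch in the background while the client consumes the current one. It is not the attached wip.patch and uses no Phoenix or HBase classes: PrefetchSketch, BatchSource, PrefetchingScanner, rangeSource, and drainAll are hypothetical names, and the in-memory integer ranges merely stand in for region scans.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: none of these types are Phoenix or HBase classes.
final class PrefetchSketch {

    /** Stand-in for a scanner: returns the next batch, or an empty list when exhausted. */
    interface BatchSource<T> {
        List<T> nextBatch();
    }

    /** Requests the following batch as soon as the current one is handed to the caller. */
    static final class PrefetchingScanner<T> {
        private final ExecutorService pool;
        private final Callable<List<T>> fetch;
        private Future<List<T>> pending;

        PrefetchingScanner(BatchSource<T> source, ExecutorService pool) {
            this.pool = pool;
            this.fetch = source::nextBatch;
            // First batch: submitted up front, so all scanners load it in parallel.
            this.pending = pool.submit(fetch);
        }

        List<T> next() throws InterruptedException, ExecutionException {
            List<T> batch = pending.get();
            if (!batch.isEmpty()) {
                // Subsequent batches: fetched in the background instead of serially.
                pending = pool.submit(fetch);
            }
            return batch;
        }
    }

    /** Simulated scan over [start, end) that hands out at most batchSize values per call. */
    static BatchSource<Integer> rangeSource(int start, int end, int batchSize) {
        int[] cursor = {start};
        return () -> {
            List<Integer> batch = new ArrayList<>();
            while (cursor[0] < end && batch.size() < batchSize) {
                batch.add(cursor[0]++);
            }
            return batch;
        };
    }

    /** Drains all sources round-robin; with no ORDER BY, batches are concatenated without a merge sort. */
    static <T> List<T> drainAll(List<BatchSource<T>> sources, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<PrefetchingScanner<T>> scanners = new ArrayList<>();
            for (BatchSource<T> source : sources) {
                scanners.add(new PrefetchingScanner<>(source, pool));
            }
            List<T> out = new ArrayList<>();
            boolean progressed = true;
            while (progressed) {
                progressed = false;
                for (PrefetchingScanner<T> scanner : scanners) {
                    List<T> batch = scanner.next();
                    if (!batch.isEmpty()) {
                        out.addAll(batch);
                        progressed = true;
                    }
                }
            }
            return out;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<BatchSource<Integer>> sources = new ArrayList<>();
        sources.add(rangeSource(0, 5, 2));
        sources.add(rangeSource(100, 103, 2));
        System.out.println(drainAll(sources, 2));
    }
}
```

In this sketch each scanner has at most one fetch in flight, so the number of buffered batches stays bounded by the number of scanners. A real implementation would also have to handle scan failures, cancellation, and memory limits, none of which are modelled here.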