Date: Wed, 1 Nov 2017 04:13:00 +0000 (UTC)
From: "ASF subversion and git services (JIRA)"
To: notifications@asterixdb.incubator.apache.org
Reply-To: dev@asterixdb.apache.org
Subject: [jira] [Commented] (ASTERIXDB-2133) Unnecessary BinarySearch in GroupFrameAccessor

    [ https://issues.apache.org/jira/browse/ASTERIXDB-2133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16233628#comment-16233628 ]

ASF subversion and git services commented on ASTERIXDB-2133:
------------------------------------------------------------

Commit c04046c11be062c4320ba7e0bbd5258ffb600ad3 in asterixdb's branch refs/heads/master from [~luochen01]
[ https://git-wip-us.apache.org/repos/asf?p=asterixdb.git;h=c04046c ]

[ASTERIXDB-2133] Fix unnecessary binary search in GroupFrameAccessor

- user model changes: no
- storage format changes: no
- interface changes: no

Details:
- GroupFrameAccessor holds a list of frames from a run during the
  merge step of merge sort. However, every time we access a tuple,
  it performs a binary search to get the physical tuple index.
  This patch fixes this by remembering the last accessed frame.
  It is expected that tuples are accessed sequentially (since it's the
  merge step), which greatly reduces the number of binary searches.

Change-Id: I4a1b19ad47f6b1dda4bd5c417932e4c9ba36a714
Reviewed-on: https://asterix-gerrit.ics.uci.edu/2079
Sonar-Qube: Jenkins
Tested-by: Jenkins
Contrib: Jenkins
Integration-Tests: Jenkins
Reviewed-by: Ian Maxon


> Unnecessary BinarySearch in GroupFrameAccessor
> ----------------------------------------------
>
>                 Key: ASTERIXDB-2133
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2133
>             Project: Apache AsterixDB
>          Issue Type: Bug
>          Components: HYR - Hyracks
>            Reporter: Chen Luo
>            Assignee: Chen Luo
>            Priority: Major
>
> During the merge step of merge sort, if there is enough memory but only a few runs to be merged, we load multiple frames per run into the GroupFrameAccessor. Every time we access a tuple, GroupFrameAccessor performs a binary search over the inner frames to translate the logical tuple index into the physical one (inner frame ID + index).
> However, this is highly inefficient, and it is part of the reason why giving the sort operation a larger memory budget results in slower performance. Since GroupFrameAccessor is only used by merge sort, tuples are expected to be accessed sequentially rather than randomly. Special optimizations can be adopted based on this sequential access pattern.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
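
The idea described in the commit message (remember the last accessed frame, and fall back to binary search only on a miss) can be sketched roughly as follows. This is an illustrative sketch only, not the actual Hyracks GroupFrameAccessor code: the class `CachedFrameLocator`, its methods `locate` and `binarySearchCount`, and the prefix-sum layout are all hypothetical names and assumptions chosen for the example.

```java
/**
 * Hypothetical sketch: maps a logical tuple index to (frame id, index in frame),
 * caching the last accessed frame so sequential access rarely needs a search.
 */
final class CachedFrameLocator {
    // starts[i] is the logical index of the first tuple in frame i;
    // starts[frameCount] is the total number of tuples.
    private final int[] starts;
    private int lastFrame = 0;       // frame that served the previous lookup
    private int binarySearches = 0;  // counter kept only for illustration

    CachedFrameLocator(int[] tupleCountsPerFrame) {
        starts = new int[tupleCountsPerFrame.length + 1];
        for (int i = 0; i < tupleCountsPerFrame.length; i++) {
            starts[i + 1] = starts[i] + tupleCountsPerFrame[i];
        }
    }

    /** Returns {frameId, indexInFrame} for the given logical tuple index. */
    int[] locate(int logicalIndex) {
        if (logicalIndex < 0 || logicalIndex >= starts[starts.length - 1]) {
            throw new IndexOutOfBoundsException("tuple index " + logicalIndex);
        }
        // Fast path: the tuple lives in the frame we touched last time.
        boolean hit = starts[lastFrame] <= logicalIndex
                && logicalIndex < starts[lastFrame + 1];
        if (!hit) {
            lastFrame = search(logicalIndex);
        }
        return new int[] { lastFrame, logicalIndex - starts[lastFrame] };
    }

    /** Binary search for the last frame whose start is <= logicalIndex. */
    private int search(int logicalIndex) {
        binarySearches++;
        int lo = 0;
        int hi = starts.length - 2; // highest valid frame id
        while (lo < hi) {
            int mid = (lo + hi + 1) >>> 1;
            if (starts[mid] <= logicalIndex) {
                lo = mid;
            } else {
                hi = mid - 1;
            }
        }
        return lo;
    }

    int binarySearchCount() {
        return binarySearches;
    }
}
```

With sequential access, a search in this sketch happens only when the index crosses into a different frame, so the number of searches is bounded by the number of frames rather than the number of tuples.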