Date: Wed, 11 May 2016 18:03:12 +0000 (UTC)
From: "Yingyi Bu (JIRA)"
To: notifications@asterixdb.incubator.apache.org
Reply-To: dev@asterixdb.incubator.apache.org
Subject: [jira] [Commented] (ASTERIXDB-1433) Multiple cores with huge memory slow down in the big fact table aggregation.
archived-at: Wed, 11 May 2016 18:03:17 -0000

    [ https://issues.apache.org/jira/browse/ASTERIXDB-1433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15280525#comment-15280525 ]

Yingyi Bu commented on ASTERIXDB-1433:
--------------------------------------

[~lwhay]

>> However, the running trace results demonstrate that, as compared to the big memory configurations,
Is it possible to paste the query and dataset details (e.g., schema, size) here?

>> the original tables is always re-loaded from the disk to the actual memory even they have been handled in the latest query.
We do have a read-only disk buffer cache. Do you have more concrete numbers, e.g., dataset size, number of read/write I/Os (from /proc/...), response time, etc.?

Best,
Yingyi

> Multiple cores with huge memory slow down in the big fact table aggregation.
> ----------------------------------------------------------------------------
>
>                 Key: ASTERIXDB-1433
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1433
>             Project: Apache AsterixDB
>          Issue Type: Improvement
>          Components: Hyracks Core
>         Environment: 10 nodes X Linux ubuntu/6 cpu X 4 cores/per cpu, 128 GB memory/per node.
>            Reporter: Wenhai
>
> This is a classic hardware platform that shoes up the TB scale of dataset in total. AsterixDB does extremely well for the complex query that includes multiple join operators over a high-selectivity select operator. However, the running trace results demonstrate that, as compared to the big memory configurations, the original tables is always re-loaded from the disk to the actual memory even they have been handled in the latest query. To this end, why not provide the strategy to keep the intermediate data of the last completed query into the memory and free them in case the memory is not enough for the newly query. In some case, the user will always trigger the query with the different parameters on the same tables, for example, the variant-parameter aggregation on the single big fact table.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)