From: hubertzhang@apache.org
To: commits@hawq.incubator.apache.org
Subject: incubator-hawq git commit: HAWQ-227. Data locality downgrade by wrong insert host.
Date: Tue, 8 Dec 2015 02:00:13 +0000 (UTC)

Repository: incubator-hawq
Updated Branches:
  refs/heads/master afd0e554c -> 72b349b4b


HAWQ-227. Data locality downgrade by wrong insert host.


Project: http://git-wip-us.apache.org/repos/asf/incubator-hawq/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-hawq/commit/72b349b4
Tree: http://git-wip-us.apache.org/repos/asf/incubator-hawq/tree/72b349b4
Diff: http://git-wip-us.apache.org/repos/asf/incubator-hawq/diff/72b349b4

Branch: refs/heads/master
Commit: 72b349b4bb6421e97ad2b838344a4d1b5be475fc
Parents: afd0e55
Author: hubertzhang
Authored: Tue Dec 8 09:57:43 2015 +0800
Committer: hubertzhang
Committed: Tue Dec 8 09:57:43 2015 +0800

----------------------------------------------------------------------
 src/backend/cdb/cdbdatalocality.c | 68 ++++++++++++++++++----------------
 1 file changed, 36 insertions(+), 32 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-hawq/blob/72b349b4/src/backend/cdb/cdbdatalocality.c
----------------------------------------------------------------------
diff --git a/src/backend/cdb/cdbdatalocality.c b/src/backend/cdb/cdbdatalocality.c
index 3e1cee7..3cacd74 100644
--- a/src/backend/cdb/cdbdatalocality.c
+++ b/src/backend/cdb/cdbdatalocality.c
@@ -2659,42 +2659,46 @@ static void allocate_random_relation(Relation_Data*
rel_data,
 	/*find the insert node for each block*/
-	int *hostOccurTimes = (int *) palloc(
-			sizeof(int) * context->dds_context.size);
+	int *hostOccurTimes = (int *) palloc(sizeof(int) * context->dds_context.size);
 	for (int fi = 0; fi < fileCount; fi++) {
-			Relation_File *rel_file = file_vector[fi];
-			/*for hash file whose bucket number doesn't equal to segment number*/
-			if (rel_file->hostIDs == NULL) {
-				rel_file->splits[0].host = 0;
-				continue;
-			}
-			MemSet(hostOccurTimes, 0, sizeof(int) * context->dds_context.size);
-			for (i = 0; i < rel_file->split_num; i++) {
-				Block_Host_Index *hostID = rel_file->hostIDs + i;
-				for (int l = 0; l < hostID->replica_num; l++) {
-					uint32_t key = hostID->hostIndex[l];
-					hostOccurTimes[key]++;
-				}
+		Relation_File *rel_file = file_vector[fi];
+		/*for hash file whose bucket number doesn't equal to segment number*/
+		if (rel_file->hostIDs == NULL) {
+			rel_file->splits[0].host = 0;
+			continue;
+		}
+		MemSet(hostOccurTimes, 0, sizeof(int) * context->dds_context.size);
+		for (i = 0; i < rel_file->split_num; i++) {
+			Block_Host_Index *hostID = rel_file->hostIDs + i;
+			for (int l = 0; l < hostID->replica_num; l++) {
+				uint32_t key = hostID->hostIndex[l];
+				hostOccurTimes[key]++;
 			}
-			int maxOccurTime = -1;
-			int inserthost = -1;
-			for(int i=0;i< context->dds_context.size;i++){
-				if(hostOccurTimes[i] > maxOccurTime){
-					maxOccurTime = hostOccurTimes[i];
-					inserthost = i;
-				}
+		}
+		int maxOccurTime = -1;
+		int inserthost = -1;
+		int hostsWithSameOccurTimesExist = true;
+		for (int i = 0; i < context->dds_context.size; i++) {
+			if (hostOccurTimes[i] > maxOccurTime) {
+				maxOccurTime = hostOccurTimes[i];
+				inserthost = i;
+				hostsWithSameOccurTimesExist = false;
+			} else if (hostOccurTimes[i] == maxOccurTime) {
+				hostsWithSameOccurTimesExist = true;
 			}
+		}
-			/* currently we consider the insert hosts are the same for all the blocks in the same file.
-			 * this logic can be changed in future, so we store the state in block level not file level*/
-			if(maxOccurTime < rel_file->split_num){
-				inserthost = -1;
-			}else{
-				for (i = 0; i < rel_file->split_num; i++) {
-					Block_Host_Index *hostID = rel_file->hostIDs + i;
-					hostID->insertHost = inserthost;
-				}
-			}
+		/* currently we consider the insert hosts are the same for all the blocks in the same file.
+		 * this logic can be changed in future, so we store the state in block level not file level
+		 * if hostsWithSameOccurTimesExist we cannot determine which is insert host
+		 * if maxOccurTime <2 we cannot determine which is insert host either*/
+		if (maxOccurTime < rel_file->split_num || maxOccurTime < 2 || hostsWithSameOccurTimesExist) {
+			inserthost = -1;
+		}
+		for (i = 0; i < rel_file->split_num; i++) {
+			Block_Host_Index *hostID = rel_file->hostIDs + i;
+			hostID->insertHost = inserthost;
+		}
 	}
 	pfree(hostOccurTimes);
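
For readers who want to try the new selection rule outside the HAWQ tree, the following is a minimal standalone sketch of the logic in the added lines. It is not the HAWQ code. BlockReplicas, MAX_REPLICAS and pick_insert_host are hypothetical names introduced here for illustration, and the plain arrays stand in for Relation_File and Block_Host_Index. The rule shown is the one from the patch: a host is accepted as the insert host only if it holds a replica of every block, appears at least twice, and no other host ties with it.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_REPLICAS 3 /* hypothetical upper bound, for this sketch only */

/* Hypothetical stand-in for Block_Host_Index: the hosts holding one block. */
typedef struct {
	int replica_num;
	int host_index[MAX_REPLICAS];
} BlockReplicas;

/*
 * Returns the index of the host that most likely wrote the file, or -1 when
 * it cannot be determined. occur is caller-provided scratch space with
 * host_count entries, mirroring hostOccurTimes in the patch.
 */
static int pick_insert_host(const BlockReplicas *blocks, int block_count,
		int host_count, int *occur)
{
	memset(occur, 0, sizeof(int) * (size_t) host_count);

	/* Count on how many blocks each host holds a replica. */
	for (int b = 0; b < block_count; b++) {
		for (int r = 0; r < blocks[b].replica_num; r++) {
			int h = blocks[b].host_index[r];
			assert(h >= 0 && h < host_count);
			occur[h]++;
		}
	}

	int max_occur = -1;
	int insert_host = -1;
	bool tie = true;
	for (int h = 0; h < host_count; h++) {
		if (occur[h] > max_occur) {
			max_occur = occur[h];
			insert_host = h;
			tie = false; /* a strictly larger count clears earlier ties */
		} else if (occur[h] == max_occur) {
			tie = true;
		}
	}

	/* Reject candidates that miss a block, appear only once, or are tied. */
	if (max_occur < block_count || max_occur < 2 || tie) {
		return -1;
	}
	return insert_host;
}
```

With three blocks replicated on hosts {0,1,2}, {0,2,3} and {0,1,3}, only host 0 holds every block, so the sketch returns 0. With two blocks on {0,1,2} and {0,1,3}, hosts 0 and 1 tie and the result is -1. A single-block file also yields -1, because one occurrence is not enough evidence.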