Return-Path: X-Original-To: apmail-hawq-commits-archive@minotaur.apache.org Delivered-To: apmail-hawq-commits-archive@minotaur.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id A5E2919541 for ; Mon, 21 Mar 2016 22:01:48 +0000 (UTC) Received: (qmail 8216 invoked by uid 500); 21 Mar 2016 22:01:48 -0000 Delivered-To: apmail-hawq-commits-archive@hawq.apache.org Received: (qmail 8170 invoked by uid 500); 21 Mar 2016 22:01:48 -0000 Mailing-List: contact commits-help@hawq.incubator.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@hawq.incubator.apache.org Delivered-To: mailing list commits@hawq.incubator.apache.org Received: (qmail 8161 invoked by uid 99); 21 Mar 2016 22:01:48 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd4-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 21 Mar 2016 22:01:48 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd4-us-west.apache.org (ASF Mail Server at spamd4-us-west.apache.org) with ESMTP id E2AECC021D for ; Mon, 21 Mar 2016 22:01:47 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd4-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: -3.221 X-Spam-Level: X-Spam-Status: No, score=-3.221 tagged_above=-999 required=6.31 tests=[KAM_ASCII_DIVIDERS=0.8, KAM_LAZY_DOMAIN_SECURITY=1, RCVD_IN_DNSWL_HI=-5, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, RP_MATCHES_RCVD=-0.001] autolearn=disabled Received: from mx1-lw-us.apache.org ([10.40.0.8]) by localhost (spamd4-us-west.apache.org [10.40.0.11]) (amavisd-new, port 10024) with ESMTP id zThjIxGHBcXu for ; Mon, 21 Mar 2016 22:01:34 +0000 (UTC) Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by mx1-lw-us.apache.org (ASF Mail Server at mx1-lw-us.apache.org) with SMTP id 9C54D5FB0C for ; Mon, 21 Mar 2016 22:01:31 +0000 (UTC) Received: (qmail 1545 invoked by uid 99); 21 Mar 2016 22:01:31 -0000 Received: from git1-us-west.apache.org (HELO git1-us-west.apache.org) (140.211.11.23) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 21 Mar 2016 22:01:31 +0000 Received: by git1-us-west.apache.org (ASF Mail Server at git1-us-west.apache.org, from userid 33) id C162FDFA43; Mon, 21 Mar 2016 22:01:30 +0000 (UTC) Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit From: shivram@apache.org To: commits@hawq.incubator.apache.org Date: Mon, 21 Mar 2016 22:01:37 -0000 Message-Id: <5c0cdfb4ac7d4cf7a79d3fb9f0061ed7@git.apache.org> In-Reply-To: <7dd36c8d476046af952cee0ed0085296@git.apache.org> References: <7dd36c8d476046af952cee0ed0085296@git.apache.org> X-Mailer: ASF-Git Admin Mailer Subject: [08/50] incubator-hawq git commit: HAWQ-473. Implement adding a entry into gp_configuration_history and add a description column to show the reason of this segment is down HAWQ-473. Implement adding a entry into gp_configuration_history and add a description column to show the reason of this segment is down Project: http://git-wip-us.apache.org/repos/asf/incubator-hawq/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-hawq/commit/d1780628 Tree: http://git-wip-us.apache.org/repos/asf/incubator-hawq/tree/d1780628 Diff: http://git-wip-us.apache.org/repos/asf/incubator-hawq/diff/d1780628 Branch: refs/heads/HAWQ-459 Commit: d178062876f0c55db7b1ac74946bd2ba096b779f Parents: 2a15066 Author: Wen Lin Authored: Tue Mar 15 10:00:16 2016 +0800 Committer: Wen Lin Committed: Tue Mar 15 10:00:16 2016 +0800 ---------------------------------------------------------------------- .../communication/rmcomm_RM2RMSEG.c | 28 +- .../resourcemanager/include/resourcepool.h | 43 +- src/backend/resourcemanager/requesthandler.c | 109 ++- .../resourcemanager/requesthandler_RMSEG.c | 1 - .../resourcebroker/resourcebroker_LIBYARN.c | 82 +- .../resourcebroker_LIBYARN_proc.c | 1 - src/backend/resourcemanager/resourcemanager.c | 91 +- src/backend/resourcemanager/resourcepool.c | 857 ++++++++++++------- src/backend/resourcemanager/resqueuemanager.c | 3 +- src/backend/utils/gp/segadmin.c | 3 +- src/include/catalog/gp_configuration.h | 27 +- src/include/catalog/gp_segment_config.h | 11 +- src/include/catalog/pg_proc.h | 5 +- src/include/utils/builtins.h | 1 + tools/bin/gppylib/data/2.0.json | 32 +- 15 files changed, 823 insertions(+), 471 deletions(-) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/incubator-hawq/blob/d1780628/src/backend/resourcemanager/communication/rmcomm_RM2RMSEG.c ---------------------------------------------------------------------- diff --git a/src/backend/resourcemanager/communication/rmcomm_RM2RMSEG.c b/src/backend/resourcemanager/communication/rmcomm_RM2RMSEG.c index fe4a595..ca332c9 100644 --- a/src/backend/resourcemanager/communication/rmcomm_RM2RMSEG.c +++ b/src/backend/resourcemanager/communication/rmcomm_RM2RMSEG.c @@ -231,11 +231,23 @@ void receivedRUAliveResponse(AsyncCommMessageHandlerContext context, * This call makes resource pool remove unused containers. */ returnAllGRMResourceFromSegment(segres); + + segres->Stat->StatusDesc |= SEG_STATUS_FAILED_PROBING_SEGMENT; /* Set the host down in gp_segment_configuration table */ if (Gp_role != GP_ROLE_UTILITY) { + SimpStringPtr description = build_segment_status_description(segres->Stat); update_segment_status(segres->Stat->ID + REGISTRATION_ORDER_OFFSET, - SEGMENT_STATUS_DOWN); + SEGMENT_STATUS_DOWN, + (description->Len > 0)?description->Str:""); + add_segment_history_row(segres->Stat->ID + REGISTRATION_ORDER_OFFSET, + GET_SEGRESOURCE_HOSTNAME(segres), + description->Str); + if (description != NULL) + { + freeSimpleStringContent(description); + rm_pfree(PCONTEXT, description); + } } /* Set the host down. */ elog(WARNING, "Resource manager sets host %s from up to down " @@ -281,10 +293,22 @@ void sentRUAliveError(AsyncCommMessageHandlerContext context) * This call makes resource pool remove unused containers. */ returnAllGRMResourceFromSegment(segres); + segres->Stat->StatusDesc |= SEG_STATUS_COMMUNICATION_ERROR; /* Set the host down in gp_segment_configuration table */ if (Gp_role != GP_ROLE_UTILITY) { - update_segment_status(segres->Stat->ID + REGISTRATION_ORDER_OFFSET, SEGMENT_STATUS_DOWN); + SimpStringPtr description = build_segment_status_description(segres->Stat); + update_segment_status(segres->Stat->ID + REGISTRATION_ORDER_OFFSET, + SEGMENT_STATUS_DOWN, + (description->Len > 0)?description->Str:""); + add_segment_history_row(segres->Stat->ID + REGISTRATION_ORDER_OFFSET, + GET_SEGRESOURCE_HOSTNAME(segres), + description->Str); + if (description != NULL) + { + freeSimpleStringContent(description); + rm_pfree(PCONTEXT, description); + } } /* Set the host down. */ elog(LOG, "Resource manager sets host %s from up to down " http://git-wip-us.apache.org/repos/asf/incubator-hawq/blob/d1780628/src/backend/resourcemanager/include/resourcepool.h ---------------------------------------------------------------------- diff --git a/src/backend/resourcemanager/include/resourcepool.h b/src/backend/resourcemanager/include/resourcepool.h index 6e1412c..5feedd3 100644 --- a/src/backend/resourcemanager/include/resourcepool.h +++ b/src/backend/resourcemanager/include/resourcepool.h @@ -145,13 +145,14 @@ struct SegStatData { int32_t ID; /* Internal ID. */ uint16_t FailedTmpDirNum; /* Failed temporary directory number */ uint8_t FTSAvailable; /* If it is available now. */ - uint8_t GRMAvailable; /* If it is global resource available.*/ - + uint8_t GRMHandled; /* If its GRM status is handled */ uint32_t FTSTotalMemoryMB; /* FTS reports memory capacity. */ uint32_t FTSTotalCore; /* FTS reports core capacity. */ uint32_t GRMTotalMemoryMB; /* GRM reports memory capacity. */ uint32_t GRMTotalCore; /* GRM reports core capacity. */ uint64_t RMStartTimestamp; /* RM process reset timestamp */ + uint32_t StatusDesc; /* Description of status */ + uint32_t Reserved; SegInfoData Info; /* 64-bit aligned. */ }; @@ -168,12 +169,9 @@ enum SegAvailabilityStatus { }; int setSegStatHAWQAvailability( SegStat machine, uint8_t newstatus); -int setSegStatGLOBAvailability( SegStat machine, uint8_t newstatus); #define IS_SEGSTAT_FTSAVAILABLE(seg) \ ((seg)->FTSAvailable == RESOURCE_SEG_STATUS_AVAILABLE) -#define IS_SEGSTAT_GRMAVAILABLE(seg) \ - ((seg)->GRMAvailable == RESOURCE_SEG_STATUS_AVAILABLE) /* Generate SegStat instance's report as a string saved in self maintained buffer. */ void generateSegStatReport(SegStat segstat, SelfMaintainBuffer buff); @@ -610,7 +608,7 @@ int returnResourceToResourcePool(int memory, void returnAllGRMResourceFromSegment(SegResource segres); void dropAllGRMContainersFromSegment(SegResource segres); -void returnAllGRMResourceFromGRMUnavailableSegments(void); +void returnAllGRMResourceFromUnavailableSegments(void); void generateSegResourceReport(int32_t nodeid, SelfMaintainBuffer buff); @@ -632,7 +630,7 @@ int addOrderedResourceAvailTreeIndexByRatio(uint32_t ratio, BBST *tree); int getOrderedResourceAvailTreeIndexByRatio(uint32_t ratio, BBST *tree); int getOrderedResourceAllocTreeIndexByRatio(uint32_t ratio, BBST *tree); -void setAllSegResourceGRMUnavailable(void); +void setAllSegResourceGRMUnhandled(void); void resetAllSegmentsGRMContainerFailAllocCount(void); @@ -671,13 +669,11 @@ void adjustMemoryCoreValue(uint32_t *memorymb, uint32_t *core); /* Clean up gp_segment_configuration */ void cleanup_segment_config(void); + +#define SEG_STATUS_DESCRIPTION_UP "segment is up" /* update a segment's status in gp_segment_configuration table */ -void update_segment_status(int32_t id, char status); -/* update a segment's status and failed temporary directory - * in gp_segment_configuration table - */ -void update_segment_failed_tmpdir -(int32_t id, char status, int32_t failedNum, char* failedTmpDir); +void update_segment_status(int32_t id, char status, char* description); + /* Add a new entry into gp_segment_configuration table*/ void add_segment_config_row(int32_t id, char *hostname, @@ -685,8 +681,25 @@ void add_segment_config_row(int32_t id, uint32_t port, char role, char status, - uint32_t failed_tmpdir_num, - char* failed_tmpdir); + char* description); + +/* + * SegStatData's StatusDesc is a combination of below flags + */ +#define SEG_STATUS_HEARTBEAT_TIMEOUT 0x00000001 +#define SEG_STATUS_FAILED_PROBING_SEGMENT 0x00000002 +#define SEG_STATUS_COMMUNICATION_ERROR 0x00000004 +#define SEG_STATUS_FAILED_TMPDIR 0x00000008 +#define SEG_STATUS_RM_RESET 0x00000010 +#define SEG_STATUS_NO_GRM_NODE_REPORT 0x00000020 + +/* Add a new entry into gp_configuration_history table */ +void add_segment_history_row(int32_t id, + char* hostname, + char* description); + +/* build a string of status description based on SegStat */ +SimpStringPtr build_segment_status_description(SegStat segstat); /* * In resource pool, segment's id starts from 0, however in gp_segment_configuration table, http://git-wip-us.apache.org/repos/asf/incubator-hawq/blob/d1780628/src/backend/resourcemanager/requesthandler.c ---------------------------------------------------------------------- diff --git a/src/backend/resourcemanager/requesthandler.c b/src/backend/resourcemanager/requesthandler.c index ba35e2e..a06d169 100644 --- a/src/backend/resourcemanager/requesthandler.c +++ b/src/backend/resourcemanager/requesthandler.c @@ -746,29 +746,13 @@ bool handleRMSEGRequestIMAlive(void **arg) destroySelfMaintainBuffer(&newseginfo); newsegstat->ID = SEGSTAT_ID_INVALID; - newsegstat->GRMAvailable = RESOURCE_SEG_STATUS_UNSET; RPCRequestHeadIMAlive header = SMBUFF_HEAD(RPCRequestHeadIMAlive, &(conntrack->MessageBuff)); newsegstat->FailedTmpDirNum = header->TmpDirBrokenCount; newsegstat->RMStartTimestamp = header->RMStartTimestamp; - - /* - * Check if the there is any failed temporary directory on this segment. - * if has, master considers this segment as down, even it has heart-beat report. - */ - if (newsegstat->FailedTmpDirNum == 0) - { - newsegstat->FTSAvailable = RESOURCE_SEG_STATUS_AVAILABLE; - } - else - { - elog(RMLOG, "Resource manager finds there is %d failed temporary directories " - "on this segment, " - "so mark this segment unavailable.", - newsegstat->FailedTmpDirNum); - newsegstat->FTSAvailable = RESOURCE_SEG_STATUS_UNAVAILABLE; - } + newsegstat->StatusDesc = 0; + newsegstat->Reserved = 0; bool capstatchanged = false; if ( addHAWQSegWithSegStat(newsegstat, &capstatchanged) != FUNC_RETURN_OK ) @@ -977,13 +961,13 @@ bool handleRMRequestSegmentIsDown(void **arg) while( (hostname - SMBUFF_CONTENT(&(conntrack->MessageBuff)) < getSMBContentSize(&(conntrack->MessageBuff))) && - *hostname != '\0' ) + *hostname != '\0' ) { hostnamelen = strlen(hostname); res = getSegIDByHostName(hostname, hostnamelen, &segid); if ( res == FUNC_RETURN_OK ) { - /* Get resourceinfo of the expected host. */ + /* Get resource info of the expected host. */ SegResource segres = getSegResource(segid); Assert( segres != NULL ); @@ -1002,45 +986,56 @@ bool handleRMRequestSegmentIsDown(void **arg) else { elog(RMLOG, "Resource manager probes the status of host %s by " - "sending RUAlive request.", + "sending RUAlive request.", hostname); - res = sendRUAlive(hostname); - /* IN THIS CASE, the segment is considered as down. */ - if (res != FUNC_RETURN_OK) - { - /*---------------------------------------------------------- - * This call makes resource manager able to adjust queue and - * mem/core trackers' capacity. - *---------------------------------------------------------- - */ - setSegResHAWQAvailability(segres, - RESOURCE_SEG_STATUS_UNAVAILABLE); - - /* Make resource pool remove unused containers */ - returnAllGRMResourceFromSegment(segres); - /* Set the host down in gp_segment_configuration table */ - if (Gp_role != GP_ROLE_UTILITY) - { - update_segment_status(segres->Stat->ID + REGISTRATION_ORDER_OFFSET, - SEGMENT_STATUS_DOWN); - } - - /* Set the host down. */ - elog(LOG, "Resource manager sets host %s from up to down " - "due to not reaching host.", hostname); - } - else - { - elog(RMLOG, "Resource manager triggered RUAlive request to " - "host %s.", + res = sendRUAlive(hostname); + /* IN THIS CASE, the segment is considered as down. */ + if (res != FUNC_RETURN_OK) + { + /*---------------------------------------------------------- + * This call makes resource manager able to adjust queue and + * mem/core trackers' capacity. + *---------------------------------------------------------- + */ + setSegResHAWQAvailability(segres, + RESOURCE_SEG_STATUS_UNAVAILABLE); + + /* Make resource pool remove unused containers */ + returnAllGRMResourceFromSegment(segres); + /* Set the host down in gp_segment_configuration table */ + segres->Stat->StatusDesc |= SEG_STATUS_FAILED_PROBING_SEGMENT; + if (Gp_role != GP_ROLE_UTILITY) + { + SimpStringPtr description = build_segment_status_description(segres->Stat); + update_segment_status(segres->Stat->ID + REGISTRATION_ORDER_OFFSET, + SEGMENT_STATUS_DOWN, + (description->Len > 0)?description->Str:""); + add_segment_history_row(segres->Stat->ID + REGISTRATION_ORDER_OFFSET, + hostname, + description->Str); + if (description != NULL) + { + freeSimpleStringContent(description); + rm_pfree(PCONTEXT, description); + } + } + + /* Set the host down. */ + elog(LOG, "Resource manager sets host %s from up to down " + "due to not reaching host.", hostname); + } + else + { + elog(RMLOG, "Resource manager triggered RUAlive request to " + "host %s.", hostname); - } + } } } else { elog(WARNING, "Resource manager cannot find host %s to check status, " - "skip it.", + "skip it.", hostname); } @@ -1124,7 +1119,7 @@ bool handleRMRequestTmpDir(void **arg) RESPONSE_QD_TMPDIR); elog(LOG, "Resource manager assigned temporary directory %s", - tmpdir->Str); + tmpdir->Str); } conntrack->ResponseSent = false; @@ -1271,11 +1266,11 @@ bool handleRMRequestDummy(void **arg) sizeof(response), conntrack->MessageMark1, conntrack->MessageMark2, - RESPONSE_DUMMY); + RESPONSE_DUMMY); conntrack->ResponseSent = false; - MEMORY_CONTEXT_SWITCH_TO(PCONTEXT) - PCONTRACK->ConnToSend = lappend(PCONTRACK->ConnToSend, conntrack); - MEMORY_CONTEXT_SWITCH_BACK + MEMORY_CONTEXT_SWITCH_TO(PCONTEXT) + PCONTRACK->ConnToSend = lappend(PCONTRACK->ConnToSend, conntrack); + MEMORY_CONTEXT_SWITCH_BACK return true; } http://git-wip-us.apache.org/repos/asf/incubator-hawq/blob/d1780628/src/backend/resourcemanager/requesthandler_RMSEG.c ---------------------------------------------------------------------- diff --git a/src/backend/resourcemanager/requesthandler_RMSEG.c b/src/backend/resourcemanager/requesthandler_RMSEG.c index e667c1a..90946a0 100644 --- a/src/backend/resourcemanager/requesthandler_RMSEG.c +++ b/src/backend/resourcemanager/requesthandler_RMSEG.c @@ -182,7 +182,6 @@ int refreshLocalHostInstance(void) RMSEG_INBUILDHOST->GRMTotalCore = 0.0; RMSEG_INBUILDHOST->FTSTotalMemoryMB = DRMGlobalInstance->SegmentMemoryMB; RMSEG_INBUILDHOST->FTSTotalCore = DRMGlobalInstance->SegmentCore; - RMSEG_INBUILDHOST->GRMAvailable = RESOURCE_SEG_STATUS_UNSET; RMSEG_INBUILDHOST->FTSAvailable = RESOURCE_SEG_STATUS_AVAILABLE; RMSEG_INBUILDHOST->ID = SEGSTAT_ID_INVALID; RMSEG_INBUILDHOST->FailedTmpDirNum = failedTmpDirNum; http://git-wip-us.apache.org/repos/asf/incubator-hawq/blob/d1780628/src/backend/resourcemanager/resourcebroker/resourcebroker_LIBYARN.c ---------------------------------------------------------------------- diff --git a/src/backend/resourcemanager/resourcebroker/resourcebroker_LIBYARN.c b/src/backend/resourcemanager/resourcebroker/resourcebroker_LIBYARN.c index 1e8c7ac..d446a6d 100644 --- a/src/backend/resourcemanager/resourcebroker/resourcebroker_LIBYARN.c +++ b/src/backend/resourcemanager/resourcebroker/resourcebroker_LIBYARN.c @@ -536,7 +536,9 @@ int handleRB2RM_ClusterReport(void) uint32_t segsize; int fd = ResBrokerNotifyPipe[0]; int piperes = 0; - List *segstats = NULL; + List *segstats = NULL; + List *allsegres = NULL; + ListCell *cell = NULL; PRESPOOL->RBClusterReportCounter++; @@ -642,11 +644,7 @@ int handleRB2RM_ClusterReport(void) return res; } - /* - * Set all current segments GRM unavailable, only the segments identified by - * one segment in segstats are available. - */ - setAllSegResourceGRMUnavailable(); + setAllSegResourceGRMUnhandled(); /* * Start to update resource pool content. The YARN cluster total size is @@ -683,6 +681,66 @@ int handleRB2RM_ClusterReport(void) rm_pfree(PCONTEXT, segstat); segstats = list_delete_first(segstats); } + + /* + * iterate all segments without GRM report, + * and update its status. + */ + getAllPAIRRefIntoList(&(PRESPOOL->Segments), &allsegres); + foreach(cell, allsegres) + { + SegResource segres = (SegResource)(((PAIR)lfirst(cell))->Value); + bool statusDescChange = false; + + /* + * skip segments handled in GRM report list + */ + if (segres->Stat->GRMHandled) + continue; + + /* + * Set no GRM node report flag for this segment. + */ + if ((segres->Stat->StatusDesc & SEG_STATUS_NO_GRM_NODE_REPORT) == 0) + { + segres->Stat->StatusDesc |= SEG_STATUS_NO_GRM_NODE_REPORT; + statusDescChange = true; + } + + if (IS_SEGSTAT_FTSAVAILABLE(segres->Stat)) + { + /* + * This segment is FTS available, but master hasn't + * gotten its GRM node report, so set this segment to DOWN. + */ + setSegResHAWQAvailability(segres, RESOURCE_SEG_STATUS_UNAVAILABLE); + } + + Assert(!IS_SEGSTAT_FTSAVAILABLE(segres->Stat)); + if (statusDescChange && Gp_role != GP_ROLE_UTILITY) + { + SimpStringPtr description = build_segment_status_description(segres->Stat); + update_segment_status(segres->Stat->ID + REGISTRATION_ORDER_OFFSET, + SEGMENT_STATUS_DOWN, + (description->Len > 0)?description->Str:""); + add_segment_history_row(segres->Stat->ID + REGISTRATION_ORDER_OFFSET, + GET_SEGRESOURCE_HOSTNAME(segres), + description->Str); + + elog(LOG, "Resource manager hasn't gotten GRM node report for segment(%s)," + "updates its status:'%c', description:%s", + GET_SEGRESOURCE_HOSTNAME(segres), + SEGMENT_STATUS_DOWN, + (description->Len > 0)?description->Str:""); + if (description != NULL) + { + freeSimpleStringContent(description); + rm_pfree(PCONTEXT, description); + } + } + } + freePAIRRefList(&(PRESPOOL->Segments), &allsegres); + MEMORY_CONTEXT_SWITCH_BACK elog(LOG, "Resource manager YARN resource broker counted HAWQ cluster now " @@ -694,10 +752,10 @@ int handleRB2RM_ClusterReport(void) PRESPOOL->GRMTotalHavingNoHAWQNode.Core); /* - * If the segment is not GRM available, RM should return all containers - * located upon them. + * If the segment is GRM unavailable or FTS unavailable, + * RM should return all containers located upon them. */ - returnAllGRMResourceFromGRMUnavailableSegments(); + returnAllGRMResourceFromUnavailableSegments(); /* Refresh available node count. */ refreshAvailableNodeCount(); @@ -706,7 +764,7 @@ int handleRB2RM_ClusterReport(void) PQUEMGR->GRMQueueCapacity = response.QueueCapacity; PQUEMGR->GRMQueueCurCapacity = response.QueueCurCapacity; PQUEMGR->GRMQueueMaxCapacity = (response.QueueMaxCapacity > 0 && - response.QueueMaxCapacity <= 1) ? + response.QueueMaxCapacity <= 1) ? response.QueueMaxCapacity : PQUEMGR->GRMQueueMaxCapacity; PQUEMGR->GRMQueueResourceTight = response.ResourceTight > 0 ? true : false; @@ -714,9 +772,9 @@ int handleRB2RM_ClusterReport(void) refreshResourceQueueCapacity(false); refreshActualMinGRMContainerPerSeg(); - PRESPOOL->LastUpdateTime = gettime_microsec(); + PRESPOOL->LastUpdateTime = gettime_microsec(); - return FUNC_RETURN_OK; + return FUNC_RETURN_OK; } /* http://git-wip-us.apache.org/repos/asf/incubator-hawq/blob/d1780628/src/backend/resourcemanager/resourcebroker/resourcebroker_LIBYARN_proc.c ---------------------------------------------------------------------- diff --git a/src/backend/resourcemanager/resourcebroker/resourcebroker_LIBYARN_proc.c b/src/backend/resourcemanager/resourcebroker/resourcebroker_LIBYARN_proc.c index 3afae7b..f99ed48 100644 --- a/src/backend/resourcemanager/resourcebroker/resourcebroker_LIBYARN_proc.c +++ b/src/backend/resourcemanager/resourcebroker/resourcebroker_LIBYARN_proc.c @@ -1482,7 +1482,6 @@ int RB2YARN_getClusterReport(DQueue hosts) segstat->ID = SEGSTAT_ID_INVALID; segstat->FTSAvailable = RESOURCE_SEG_STATUS_UNSET; - segstat->GRMAvailable = RESOURCE_SEG_STATUS_AVAILABLE; segstat->GRMTotalMemoryMB = pnodereport->memoryCapability; segstat->GRMTotalCore = pnodereport->vcoresCapability; segstat->FTSTotalMemoryMB = 0; http://git-wip-us.apache.org/repos/asf/incubator-hawq/blob/d1780628/src/backend/resourcemanager/resourcemanager.c ---------------------------------------------------------------------- diff --git a/src/backend/resourcemanager/resourcemanager.c b/src/backend/resourcemanager/resourcemanager.c index 9a4bb85..138c5a0 100644 --- a/src/backend/resourcemanager/resourcemanager.c +++ b/src/backend/resourcemanager/resourcemanager.c @@ -489,7 +489,6 @@ int ResManagerMainServer2ndPhase(void) PostPortNumber, SEGMENT_ROLE_MASTER_CONFIG, SEGMENT_STATUS_UP, - 0, ""); /* Load queue and user definition as no DDL now. */ @@ -2096,12 +2095,9 @@ int generateAllocRequestToBroker(void) /* * Resource manager skips this segment if * 1) Not FTS available; - * 2) Not GRM available; - * 3) Having resource decrease pending. + * 2) Having resource decrease pending. */ if (!IS_SEGSTAT_FTSAVAILABLE(segres->Stat) || - (DRMGlobalInstance->ImpType != NONE_HAWQ2 && - !IS_SEGSTAT_GRMAVAILABLE(segres->Stat)) || (segres->DecPending.MemoryMB > 0 && segres->DecPending.Core > 0)) { continue; @@ -2311,12 +2307,9 @@ void completeAllocRequestToBroker(int32_t *reqmem, /* * Resource manager skips this segment if * 1) Not FTS available; - * 2) Not GRM available; - * 3) Having resource decrease pending. + * 2) Having resource decrease pending. */ - if (!IS_SEGSTAT_FTSAVAILABLE(segres->Stat) || - (DRMGlobalInstance->ImpType != NONE_HAWQ2 && - !IS_SEGSTAT_GRMAVAILABLE(segres->Stat)) || + if (!IS_SEGSTAT_FTSAVAILABLE(segres->Stat) || (segres->DecPending.MemoryMB > 0 && segres->DecPending.Core > 0)) { index++; @@ -2391,12 +2384,9 @@ void completeAllocRequestToBroker(int32_t *reqmem, /* * Resource manager skips this segment if * 1) Not FTS available; - * 2) Not GRM available; - * 3) Having resource decrease pending. + * 2) Having resource decrease pending. */ if (!IS_SEGSTAT_FTSAVAILABLE(segres->Stat) || - (DRMGlobalInstance->ImpType != NONE_HAWQ2 && - !IS_SEGSTAT_GRMAVAILABLE(segres->Stat)) || (segres->DecPending.MemoryMB > 0 && segres->DecPending.Core > 0)) { index++; @@ -2566,31 +2556,51 @@ void updateStatusOfAllNodes() curtime = gettime_microsec(); for(uint32_t idx = 0; idx < PRESPOOL->SegmentIDCounter; idx++) { - node = getSegResource(idx); - if (node != NULL && - (curtime - node->LastUpdateTime > + node = getSegResource(idx); + uint8_t oldStatus = node->Stat->FTSAvailable; + if (node != NULL && + (curtime - node->LastUpdateTime > 1000000LL * rm_segment_heartbeat_timeout) && - IS_SEGSTAT_FTSAVAILABLE(node->Stat) ) - { - /* - * This call makes resource manager able to adjust queue and mem/core - * trackers' capacity. - */ - setSegResHAWQAvailability(node, RESOURCE_SEG_STATUS_UNAVAILABLE); - /* - * This call makes resource pool remove unused containers. - */ - returnAllGRMResourceFromSegment(node); - if (Gp_role != GP_ROLE_UTILITY) - { - update_segment_status(idx + REGISTRATION_ORDER_OFFSET, SEGMENT_STATUS_DOWN); - } + (node->Stat->StatusDesc & SEG_STATUS_HEARTBEAT_TIMEOUT) == 0) + { + /* + * This segment is heartbeat timeout, update its description + * and set it to unavailable if needed. + */ + if (oldStatus == RESOURCE_SEG_STATUS_AVAILABLE) + { + /* + * This call makes resource manager able to adjust queue and mem/core + * trackers' capacity. + */ + setSegResHAWQAvailability(node, RESOURCE_SEG_STATUS_UNAVAILABLE); + /* + * This call makes resource pool remove unused containers. + */ + returnAllGRMResourceFromSegment(node); + changedstatus = true; + } - elog(WARNING, "Resource manager sets host %s from up to down.", - GET_SEGRESOURCE_HOSTNAME(node)); + node->Stat->StatusDesc |= SEG_STATUS_HEARTBEAT_TIMEOUT; + if (Gp_role != GP_ROLE_UTILITY) + { + SimpStringPtr description = build_segment_status_description(node->Stat); + update_segment_status(idx + REGISTRATION_ORDER_OFFSET, + SEGMENT_STATUS_DOWN, + (description->Len > 0)?description->Str:""); + add_segment_history_row(idx + REGISTRATION_ORDER_OFFSET, + GET_SEGRESOURCE_HOSTNAME(node), + (description->Len > 0)?description->Str:""); + if (description != NULL) + { + freeSimpleStringContent(description); + rm_pfree(PCONTEXT, description); + } + } - changedstatus = true; - } + elog(WARNING, "Resource manager sets host %s heartbeat timeout.", + GET_SEGRESOURCE_HOSTNAME(node)); + } } if ( changedstatus ) @@ -2711,11 +2721,10 @@ int loadHostInformationIntoResourcePool(void) /* Build machine info instance. */ SegStat segstat = (SegStat)rm_palloc0(PCONTEXT, - offsetof(SegStatData, Info) + - seginfobuff.Cursor + 1); - segstat->ID = SEGSTAT_ID_INVALID; - segstat->GRMAvailable = RESOURCE_SEG_STATUS_UNSET; - segstat->FTSAvailable = RESOURCE_SEG_STATUS_AVAILABLE; + offsetof(SegStatData, Info) + + seginfobuff.Cursor + 1); + segstat->ID = SEGSTAT_ID_INVALID; + segstat->FTSAvailable = RESOURCE_SEG_STATUS_AVAILABLE; segstat->FTSTotalMemoryMB = DRMGlobalInstance->SegmentMemoryMB; segstat->FTSTotalCore = DRMGlobalInstance->SegmentCore; segstat->GRMTotalMemoryMB = 0; http://git-wip-us.apache.org/repos/asf/incubator-hawq/blob/d1780628/src/backend/resourcemanager/resourcepool.c ---------------------------------------------------------------------- diff --git a/src/backend/resourcemanager/resourcepool.c b/src/backend/resourcemanager/resourcepool.c index 329a5c6..6a36801 100644 --- a/src/backend/resourcemanager/resourcepool.c +++ b/src/backend/resourcemanager/resourcepool.c @@ -26,6 +26,8 @@ #include "resourcepool.h" #include "gp-libpq-fe.h" #include "gp-libpq-int.h" +#include "utils/builtins.h" +#include "catalog/pg_proc.h" void addSegResourceAvailIndex(SegResource segres); void addSegResourceAllocIndex(SegResource segres); @@ -467,12 +469,99 @@ cleanup: PQclear(result); PQfinish(conn); } + +/* + * Remove all entries in gp_configuration_history. + * + * gp_remove_segment_history() + * + * Returns: + * true upon success, otherwise throws error. + */ +Datum +gp_remove_segment_history(PG_FUNCTION_ARGS) +{ + int libpqres = CONNECTION_OK; + PGconn *conn = NULL; + char conninfo[512]; + PQExpBuffer sql = NULL; + PGresult* result = NULL; + int ret = true; + + if (!superuser()) + elog(ERROR, "gp_remove_segment_history can only be run by a superuser"); + + if (Gp_role != GP_ROLE_UTILITY) + elog(ERROR, "gp_remove_segment_history can only be run in utility mode"); + + sprintf(conninfo, "options='-c gp_session_role=UTILITY -c allow_system_table_mods=dml' " + "dbname=template1 port=%d connect_timeout=%d", master_addr_port, CONNECT_TIMEOUT); + conn = PQconnectdb(conninfo); + if ((libpqres = PQstatus(conn)) != CONNECTION_OK) + { + elog(WARNING, "Fail to connect database when cleanup " + "segment configuration catalog table, error code: %d, %s", + libpqres, + PQerrorMessage(conn)); + PQfinish(conn); + PG_RETURN_BOOL(false); + } + + result = PQexec(conn, "BEGIN"); + if (!result || PQresultStatus(result) != PGRES_COMMAND_OK) + { + elog(WARNING, "Fail to run SQL: %s when cleanup " + "segment history catalog table, reason : %s", + "BEGIN", + PQresultErrorMessage(result)); + ret = false; + goto cleanup; + } + PQclear(result); + + sql = createPQExpBuffer(); + appendPQExpBuffer(sql,"DELETE FROM gp_configuration_history"); + result = PQexec(conn, sql->data); + if (!result || PQresultStatus(result) != PGRES_COMMAND_OK) + { + elog(WARNING, "Fail to run SQL: %s when cleanup " + "segment history catalog table, reason : %s", + sql->data, + PQresultErrorMessage(result)); + ret = false; + goto cleanup; + } + PQclear(result); + + result = PQexec(conn, "COMMIT"); + if (!result || PQresultStatus(result) != PGRES_COMMAND_OK) + { + elog(WARNING, "Fail to run SQL: %s when cleanup " + "segment history catalog table, reason : %s", + "COMMIT", + PQresultErrorMessage(result)); + ret = false; + goto cleanup; + } + + elog(LOG, "Cleanup segment history catalog table successfully!"); + +cleanup: + if(sql) + destroyPQExpBuffer(sql); + if(result) + PQclear(result); + PQfinish(conn); + + PG_RETURN_BOOL(ret); +} + /* * update a segment's status in gp_segment_configuration table. * id : registration order of this segment * status : new status of this segment */ -void update_segment_status(int32_t id, char status) +void update_segment_status(int32_t id, char status, char* description) { int libpqres = CONNECTION_OK; PGconn *conn = NULL; @@ -480,15 +569,22 @@ void update_segment_status(int32_t id, char status) PQExpBuffer sql = NULL; PGresult* result = NULL; + if (status == SEGMENT_STATUS_UP) + Assert(strlen(description) == 0); + else if (status == SEGMENT_STATUS_DOWN) + Assert(strlen(description) != 0); + else + Assert(0); + sprintf(conninfo, "options='-c gp_session_role=UTILITY -c allow_system_table_mods=dml' " "dbname=template1 port=%d connect_timeout=%d", master_addr_port, CONNECT_TIMEOUT); conn = PQconnectdb(conninfo); if ((libpqres = PQstatus(conn)) != CONNECTION_OK) { elog(WARNING, "Fail to connect database when update segment's status " - "in segment configuration catalog table, error code: %d, %s", - libpqres, - PQerrorMessage(conn)); + "in segment configuration catalog table, error code: %d, %s", + libpqres, + PQerrorMessage(conn)); PQfinish(conn); return; } @@ -497,23 +593,24 @@ void update_segment_status(int32_t id, char status) if (!result || PQresultStatus(result) != PGRES_COMMAND_OK) { elog(WARNING, "Fail to run SQL: %s when update segment's status " - "in segment configuration catalog table, reason : %s", - "BEGIN", - PQresultErrorMessage(result)); + "in segment configuration catalog table, reason : %s", + "BEGIN", + PQresultErrorMessage(result)); goto cleanup; } PQclear(result); sql = createPQExpBuffer(); - appendPQExpBuffer(sql,"UPDATE gp_segment_configuration SET status='%c' WHERE registration_order=%d", - status, id); + appendPQExpBuffer(sql, "UPDATE gp_segment_configuration SET status='%c', " + "description='%s' WHERE registration_order=%d", + status, description, id); result = PQexec(conn, sql->data); if (!result || PQresultStatus(result) != PGRES_COMMAND_OK) { elog(WARNING, "Fail to run SQL: %s when update segment's status " - "in segment configuration catalog table, reason : %s", - sql->data, - PQresultErrorMessage(result)); + "in segment configuration catalog table, reason : %s", + sql->data, + PQresultErrorMessage(result)); goto cleanup; } PQclear(result); @@ -528,9 +625,10 @@ void update_segment_status(int32_t id, char status) goto cleanup; } - elog(LOG, "Update a segment's status to '%c' in segment configuration catalog table," - "registration_order : %d", - status, id); + elog(LOG, "Update a segment(registration_order:%d)'s status to '%c'," + "description to '%s', " + "in segment configuration catalog table,", + id, status, description); cleanup: if(sql) @@ -541,19 +639,17 @@ cleanup: } /* - * update a segment's status and failed tmp dir - * in gp_segment_configuration table. + * Insert a row into gp_configuration_history, + * to record the status change of a segment. * id : registration order of this segment - * status : new status of this segment - * failedNum : number of failed temporary directory - * failedTmpDir : failed temporary directory list, separated by comma + * hostname : hostname of this segment + * description : description of status change */ -void update_segment_failed_tmpdir -(int32_t id, char status, int32_t failedNum, char* failedTmpDir) +void add_segment_history_row(int32_t id, char* hostname, char* description) { int libpqres = CONNECTION_OK; PGconn *conn = NULL; - char conninfo[512]; + char conninfo[1024]; PQExpBuffer sql = NULL; PGresult* result = NULL; @@ -562,8 +658,8 @@ void update_segment_failed_tmpdir conn = PQconnectdb(conninfo); if ((libpqres = PQstatus(conn)) != CONNECTION_OK) { - elog(WARNING, "Fail to connect database when update segment's failed tmpdir " - "in segment configuration catalog table, error code: %d, %s", + elog(WARNING, "Fail to connect database when add a new row into " + "segment configuration history catalog table, error code: %d, %s", libpqres, PQerrorMessage(conn)); PQfinish(conn); @@ -573,24 +669,28 @@ void update_segment_failed_tmpdir result = PQexec(conn, "BEGIN"); if (!result || PQresultStatus(result) != PGRES_COMMAND_OK) { - elog(WARNING, "Fail to run SQL: %s when update segment's failed tmpdir " - "in segment configuration catalog table, reason : %s", + elog(WARNING, "Fail to run SQL: %s when add a new row into " + "segment configuration history catalog table, reason : %s", "BEGIN", PQresultErrorMessage(result)); goto cleanup; } PQclear(result); + TimestampTz curtime = GetCurrentTimestamp(); + const char *curtimestr = timestamptz_to_str(curtime); sql = createPQExpBuffer(); - appendPQExpBuffer(sql, "UPDATE gp_segment_configuration SET " - "status='%c', failed_tmpdir_num = '%d', failed_tmpdir = '%s' " - "WHERE registration_order=%d", - status, failedNum, failedTmpDir, id); + appendPQExpBuffer(sql, + "INSERT INTO gp_configuration_history" + "(time, registration_order, hostname, description) " + "VALUES ('%s','%d','%s','%s')", + curtimestr, id, hostname, description); + result = PQexec(conn, sql->data); if (!result || PQresultStatus(result) != PGRES_COMMAND_OK) { - elog(WARNING, "Fail to run SQL: %s when update segment's failed tmpdir " - "in segment configuration catalog table, reason : %s", + elog(WARNING, "Fail to run SQL: %s when add a new row into " + "segment configuration history catalog table, reason : %s", sql->data, PQresultErrorMessage(result)); goto cleanup; @@ -600,18 +700,16 @@ void update_segment_failed_tmpdir result = PQexec(conn, "COMMIT"); if (!result || PQresultStatus(result) != PGRES_COMMAND_OK) { - elog(WARNING, "Fail to run SQL: %s when update segment's failed tmpdir " - "in segment configuration catalog table, reason : %s", + elog(WARNING, "Fail to run SQL: %s when add a new row into " + "segment configuration history catalog table, reason : %s", "COMMIT", PQresultErrorMessage(result)); goto cleanup; } - elog(LOG, "Update a segment's failed tmpdir:" - "status to '%c', failed_tmpdir_num to '%d', failed_tmpdir to '%s' " - "in segment configuration catalog table," - "registration_order : %d", - status, failedNum, failedTmpDir, id); + elog(LOG, "Add a new row into segment configuration history catalog table," + "time: %s, registration order:%d, hostname:%s, description:%s", + curtimestr, id, hostname, description); cleanup: if(sql) @@ -629,8 +727,7 @@ cleanup: * port : port of this segment * role : role of this segment * status : up or down - * failed_tmpdir_num : the number of failed temporary directory - * failed_tmpdir : failed temporary directory, separated by comma + * description : the description of this segment's current status */ void add_segment_config_row(int32_t id, char* hostname, @@ -638,9 +735,7 @@ void add_segment_config_row(int32_t id, uint32_t port, char role, char status, - uint32_t - failed_tmpdir_num, - char* failed_tmpdir) + char* description) { int libpqres = CONNECTION_OK; PGconn *conn = NULL; @@ -677,10 +772,10 @@ void add_segment_config_row(int32_t id, { appendPQExpBuffer(sql, "INSERT INTO gp_segment_configuration" - "(registration_order,role,status,port,hostname,address,failed_tmpdir_num,failed_tmpdir) " + "(registration_order,role,status,port,hostname,address,description) " "VALUES " - "(%d,'%c','%c',%d,'%s','%s',%d,'%s')", - id,role,status,port,hostname,address,failed_tmpdir_num,failed_tmpdir); + "(%d,'%c','%c',%d,'%s','%s','%s')", + id,role,status,port,hostname,address,description); } else { @@ -713,8 +808,8 @@ void add_segment_config_row(int32_t id, elog(LOG, "Add a new row into segment configuration catalog table," "registration order:%d, role:%c, status:%c, port:%d, " - "hostname:%s, address:%s, failed_tmpdir_num:%d, failed_tmpdir:%s", - id, role, status, port, hostname, address, failed_tmpdir_num, failed_tmpdir); + "hostname:%s, address:%s, description:%s", + id, role, status, port, hostname, address, description); cleanup: if(sql) @@ -758,41 +853,38 @@ int addHAWQSegWithSegStat(SegStat segstat, bool *capstatchanged) */ hostname = GET_SEGINFO_HOSTNAME(&(segstat->Info)); hostnamelen = segstat->Info.HostNameLen; - res = getSegIDByHostName(hostname, hostnamelen, &segid); - if ( res != FUNC_RETURN_OK ) - { - /* Try addresses of this machine. */ - for ( int i = 0 ; i < segstat->Info.HostAddrCount ; ++i ) - { - - elog(DEBUG5, "Resource manager checks host ip (%d)th to get segment.", i); - AddressString addr = NULL; - getSegInfoHostAddrStr(&(segstat->Info), i, &addr); - if ( strcmp(addr->Address, IPV4_DOT_ADDR_LO) == 0 && - segstat->Info.HostAddrCount > 1 ) - { - /* - * If the host has only one address as 127.0.0.1, we have to - * save it to track the only one segment, otherwise, we skip - * 127.0.0.1 address in resource pool. Because, each node has - * one entry of this address. - */ - continue; - } + res = getSegIDByHostName(hostname, hostnamelen, &segid); + if ( res != FUNC_RETURN_OK ) + { + /* Try addresses of this machine. */ + for ( int i = 0 ; i < segstat->Info.HostAddrCount ; ++i ) + { + elog(DEBUG5, "Resource manager checks host ip (%d)th to get segment.", i); + AddressString addr = NULL; + getSegInfoHostAddrStr(&(segstat->Info), i, &addr); + if ( strcmp(addr->Address, IPV4_DOT_ADDR_LO) == 0 && + segstat->Info.HostAddrCount > 1 ) + { + /* + * If the host has only one address as 127.0.0.1, we have to + * save it to track the only one segment, otherwise, we skip + * 127.0.0.1 address in resource pool. Because, each node has + * one entry of this address. + */ + continue; + } - res = getSegIDByHostAddr((uint8_t *)(addr->Address), addr->Length, &segid); - if ( res == FUNC_RETURN_OK ) - { - break; - } - } - } + res = getSegIDByHostAddr((uint8_t *)(addr->Address), addr->Length, &segid); + if ( res == FUNC_RETURN_OK ) + { + break; + } + } + } - /* CASE 1. It is a new host. */ + /* CASE 1. It is a new host. */ if ( res != FUNC_RETURN_OK ) { - uint8_t reportStatus = segstat->FTSAvailable; - /* Create machine information and corresponding resource information. */ segresource = createSegResource(segstat); @@ -815,7 +907,7 @@ int addHAWQSegWithSegStat(SegStat segstat, bool *capstatchanged) setSimpleStringRef(&hostnamekey, hostname, hostnamelen); oldval = setHASHTABLENode(&(PRESPOOL->SegmentHostNameIndexed), - TYPCONVERT(void *, &hostnamekey), + TYPCONVERT(void *, &hostnamekey), TYPCONVERT(void *, segid), false /* There should be no old value. */); if ( oldval != NULL ) @@ -844,22 +936,39 @@ int addHAWQSegWithSegStat(SegStat segstat, bool *capstatchanged) TYPCONVERT(void *, &hostaddrkey)) == NULL ) { setHASHTABLENode( &(PRESPOOL->SegmentHostAddrIndexed), - TYPCONVERT(void *, &hostaddrkey), + TYPCONVERT(void *, &hostaddrkey), TYPCONVERT(void *, segid), false /* There should be no old value. */); elog(LOG, "Resource manager tracked ip address '%.*s' for host '%s'", hostaddrkey.Len, hostaddrkey.Array, - hostname); + hostname); } } /* - * This is a new host registration. Normally the status is available, - * But if the number of failed temporary directory exceeds guc, + * If in GRM mode, the new registered segment is marked + * as DOWN, since master hasn't get a cluster report for it + * from global Resource Manager. + */ + if (DRMGlobalInstance->ImpType != NONE_HAWQ2) + { + segresource->Stat->StatusDesc |= SEG_STATUS_NO_GRM_NODE_REPORT; + } + + /* + * If there is any failed temporary directory, * this segment is considered as unavailable. */ - setSegResHAWQAvailability(segresource, reportStatus); + if (segresource->Stat->FailedTmpDirNum != 0) + { + segresource->Stat->StatusDesc |= SEG_STATUS_FAILED_TMPDIR; + } + + if (segresource->Stat->StatusDesc == 0) + { + setSegResHAWQAvailability(segresource, RESOURCE_SEG_STATUS_AVAILABLE); + } /* Add this node into the table gp_segment_configuration */ AddressString straddr = NULL; @@ -868,16 +977,26 @@ int addHAWQSegWithSegStat(SegStat segstat, bool *capstatchanged) if (Gp_role != GP_ROLE_UTILITY) { + SimpStringPtr description = build_segment_status_description(segresource->Stat); add_segment_config_row (segid+REGISTRATION_ORDER_OFFSET, hostname, straddr->Address, segresource->Stat->Info.port, SEGMENT_ROLE_PRIMARY, - segresource->Stat->FTSAvailable == RESOURCE_SEG_STATUS_AVAILABLE ? + IS_SEGSTAT_FTSAVAILABLE(segresource->Stat) ? SEGMENT_STATUS_UP:SEGMENT_STATUS_DOWN, - segresource->Stat->FailedTmpDirNum, - segresource->Stat->FailedTmpDirNum == 0 ? - "":GET_SEGINFO_FAILEDTMPDIR(&segresource->Stat->Info)); + (description->Len > 0)?description->Str:""); + + add_segment_history_row(segid+REGISTRATION_ORDER_OFFSET, + hostname, + IS_SEGSTAT_FTSAVAILABLE(segresource->Stat) ? + SEG_STATUS_DESCRIPTION_UP:description->Str); + + if (description != NULL) + { + freeSimpleStringContent(description); + rm_pfree(PCONTEXT, description); + } } if (segresource->Stat->FTSAvailable == RESOURCE_SEG_STATUS_AVAILABLE) @@ -901,37 +1020,78 @@ int addHAWQSegWithSegStat(SegStat segstat, bool *capstatchanged) else { segresource = getSegResource(segid); Assert(segresource != NULL); + uint32_t oldStatusDesc = segresource->Stat->StatusDesc; uint8_t oldStatus = segresource->Stat->FTSAvailable; - bool statusChanged = oldStatus != segstat->FTSAvailable; /* - * Check if RM process is restarted in this segment. * If the latest reported RM process startup timestamp doesn't * match the previous, master RM consider segment's RM process * has restarted. * In rare case, the system's time is reset and segment's RM process * happen to get a same timestamp with previous one. */ - if (segresource->Stat->RMStartTimestamp != segstat->RMStartTimestamp) + if (segresource->Stat->RMStartTimestamp != segstat->RMStartTimestamp && + (segresource->Stat->StatusDesc & SEG_STATUS_HEARTBEAT_TIMEOUT) == 0 && + (segresource->Stat->StatusDesc & SEG_STATUS_COMMUNICATION_ERROR) == 0 && + (segresource->Stat->StatusDesc & SEG_STATUS_FAILED_PROBING_SEGMENT) == 0) { /* - * This segment's RM process has restarted, - * we should clean up old status, so mark it down. + * This segment's RM process has restarted. + * if StatusDesc doesn't have heartbeat timeout flag, or communication error, + * or failed probing segment flag, this segment is set to DOWN. + * It will be set to UP when reports a new heartbeat. */ - if (oldStatus == RESOURCE_SEG_STATUS_AVAILABLE && !statusChanged) - { - segstat->FTSAvailable = RESOURCE_SEG_STATUS_UNAVAILABLE; - statusChanged = true; - } + segresource->Stat->StatusDesc |= SEG_STATUS_RM_RESET; segresource->Stat->RMStartTimestamp = segstat->RMStartTimestamp; elog(LOG, "Master RM finds segment:%s 's RM process has restarted. " - "old status:%d, new status:%d", + "old status:%d", GET_SEGRESOURCE_HOSTNAME(segresource), - oldStatus, - segstat->FTSAvailable); + oldStatus); + } + else + { + if (segresource->Stat->RMStartTimestamp != segstat->RMStartTimestamp) + { + segresource->Stat->RMStartTimestamp = segstat->RMStartTimestamp; + } + /* + * Now clear heartbeat timeout flag, RM reset flag, + * no response flag, communication error flag + * and RM reset flag in StatusDesc + */ + if ((segresource->Stat->StatusDesc & SEG_STATUS_HEARTBEAT_TIMEOUT) != 0) + { + segresource->Stat->StatusDesc &= ~SEG_STATUS_HEARTBEAT_TIMEOUT; + elog(RMLOG, "Master RM gets heartbeat report from segment:%s, " + "clear its heartbeat timeout flag", + GET_SEGRESOURCE_HOSTNAME(segresource)); + } + if ((segresource->Stat->StatusDesc & SEG_STATUS_FAILED_PROBING_SEGMENT) != 0) + { + segresource->Stat->StatusDesc &= ~SEG_STATUS_FAILED_PROBING_SEGMENT; + elog(RMLOG, "Master RM gets heartbeat report from segment:%s, " + "clear its failed probing segment flag", + GET_SEGRESOURCE_HOSTNAME(segresource)); + } + if ((segresource->Stat->StatusDesc & SEG_STATUS_COMMUNICATION_ERROR) != 0) + { + segresource->Stat->StatusDesc &= ~SEG_STATUS_COMMUNICATION_ERROR; + elog(RMLOG, "Master RM gets heartbeat report from segment:%s, " + "clear its communication error flag", + GET_SEGRESOURCE_HOSTNAME(segresource)); + } + if ((segresource->Stat->StatusDesc & SEG_STATUS_RM_RESET) != 0) + { + segresource->Stat->StatusDesc &= ~SEG_STATUS_RM_RESET; + elog(RMLOG, "Master RM gets heartbeat report from segment:%s, " + "clear its RM reset flag", + GET_SEGRESOURCE_HOSTNAME(segresource)); + } } - /* Check if temporary directory path is changed */ + /* + * If temporary directory path is changed, update SegStatData + */ bool tmpDirChanged = false; if (segresource->Stat->FailedTmpDirNum != segstat->FailedTmpDirNum) { @@ -944,160 +1104,162 @@ int addHAWQSegWithSegStat(SegStat segstat, bool *capstatchanged) GET_SEGINFO_FAILEDTMPDIR(&segstat->Info)) != 0) { tmpDirChanged = true; - elog(LOG, "Resource manager finds segment %s(%d) 's " - "failed temporary directory is changed from " - "'%s' to '%s'", - GET_SEGRESOURCE_HOSTNAME(segresource), - segid, - GET_SEGINFO_FAILEDTMPDIR(&segresource->Stat->Info), - GET_SEGINFO_FAILEDTMPDIR(&segstat->Info)); } } - /* - * Either the FTSAvailable or the failed temporary directory - * of this segment is changed. - */ - if (statusChanged || tmpDirChanged) + if (tmpDirChanged) { - if (statusChanged && !tmpDirChanged) - { - if (Gp_role != GP_ROLE_UTILITY) - { - update_segment_status(segresource->Stat->ID + REGISTRATION_ORDER_OFFSET, - segstat->FTSAvailable == RESOURCE_SEG_STATUS_AVAILABLE ? - SEGMENT_STATUS_UP:SEGMENT_STATUS_DOWN); - } - - setSegResHAWQAvailability(segresource, segstat->FTSAvailable); - /* - * Segment is set from up to down, return resource. - */ - if (oldStatus == RESOURCE_SEG_STATUS_AVAILABLE) - { - *capstatchanged = true; - returnAllGRMResourceFromSegment(segresource); - } + elog(RMLOG, "Resource manager is going to set segment %s(%d) 's " + "failed temporary directory from " + "'%s' to '%s'", + GET_SEGRESOURCE_HOSTNAME(segresource), + segid, + segresource->Stat->FailedTmpDirNum == 0 ? + "" : GET_SEGINFO_FAILEDTMPDIR(&segresource->Stat->Info), + segstat->FailedTmpDirNum == 0 ? + "" : GET_SEGINFO_FAILEDTMPDIR(&segstat->Info)); + + int old = segresource->Stat->Info.FailedTmpDirLen == 0 ? + 0 :__SIZE_ALIGN64(segresource->Stat->Info.FailedTmpDirLen+1); + int new = segstat->Info.FailedTmpDirLen == 0 ? + 0 : __SIZE_ALIGN64(segstat->Info.FailedTmpDirLen+1); + + int current = segresource->Stat->Info.Size - + (segresource->Stat->Info.HostNameOffset + __SIZE_ALIGN64(segresource->Stat->Info.HostNameLen+1)); + if (segresource->Stat->Info.GRMHostNameLen != 0 && segresource->Stat->Info.GRMHostNameOffset != 0) + current -= __SIZE_ALIGN64(segresource->Stat->Info.GRMHostNameLen+1); + if (segresource->Stat->Info.GRMRackNameLen != 0 && segresource->Stat->Info.GRMRackNameOffset != 0) + current -= __SIZE_ALIGN64(segresource->Stat->Info.GRMRackNameLen+1); - elog(LOG, "Master resource manager sets segment %s(%d)'s status " - "to %c", - GET_SEGRESOURCE_HOSTNAME(segresource), - segid, - segstat->FTSAvailable == RESOURCE_SEG_STATUS_AVAILABLE ? - SEGMENT_STATUS_UP:SEGMENT_STATUS_DOWN); + /* + * repalloc memory if new size exceeds the old one. + * we don't shrink memory size if new size is less than the old one. + */ + if (new > old && current < new) + { + SegStat newSegStat = rm_repalloc(PCONTEXT, + segresource->Stat, + offsetof(SegStatData, Info) + + segresource->Stat->Info.Size + (new - old)); + segresource->Stat = newSegStat; + segresource->Stat->Info.Size += (new - old); } - else + + if (segresource->Stat->Info.FailedTmpDirOffset == 0) { - /* - * Failed temporary directory is changed, - * if the length of new failed temporary directory exceeds the old one, - * we need to repalloc SegInfoData - */ - elog(RMLOG, "Master resource manager is going to set segment %s(%d)'s " - "failed temporary directory from '%s' to '%s'", - GET_SEGRESOURCE_HOSTNAME(segresource), - segid, - segresource->Stat->FailedTmpDirNum == 0 ? - "" : GET_SEGINFO_FAILEDTMPDIR(&segresource->Stat->Info), - segstat->FailedTmpDirNum == 0 ? - "" : GET_SEGINFO_FAILEDTMPDIR(&segstat->Info)); - - int old = segresource->Stat->Info.FailedTmpDirLen == 0 ? - 0 :__SIZE_ALIGN64(segresource->Stat->Info.FailedTmpDirLen+1); - int new = segstat->Info.FailedTmpDirLen == 0 ? - 0 : __SIZE_ALIGN64(segstat->Info.FailedTmpDirLen+1); - - int current = segresource->Stat->Info.Size - - (segresource->Stat->Info.HostNameOffset + __SIZE_ALIGN64(segresource->Stat->Info.HostNameLen+1)); + Assert(segresource->Stat->FailedTmpDirNum == 0); + segresource->Stat->Info.FailedTmpDirOffset = segresource->Stat->Info.HostNameOffset + + __SIZE_ALIGN64(segresource->Stat->Info.HostNameLen+1); if (segresource->Stat->Info.GRMHostNameLen != 0 && segresource->Stat->Info.GRMHostNameOffset != 0) - current -= __SIZE_ALIGN64(segresource->Stat->Info.GRMHostNameLen+1); + segresource->Stat->Info.FailedTmpDirOffset += __SIZE_ALIGN64(segresource->Stat->Info.GRMHostNameLen+1); if (segresource->Stat->Info.GRMRackNameLen != 0 && segresource->Stat->Info.GRMRackNameOffset != 0) - current -= __SIZE_ALIGN64(segresource->Stat->Info.GRMRackNameLen+1); + segresource->Stat->Info.FailedTmpDirOffset += __SIZE_ALIGN64(segresource->Stat->Info.GRMRackNameLen+1); + } - /* - * repalloc memory if new size exceeds the old one. - * we don't shrink memory size if new size is less than the old one. - */ - if (new > old && current < new) - { - SegStat newSegStat = rm_repalloc(PCONTEXT, - segresource->Stat, - offsetof(SegStatData, Info) + - segresource->Stat->Info.Size + (new - old)); - segresource->Stat = newSegStat; - segresource->Stat->Info.Size += (new - old); - } + /* Clear old failed temporary directory string in SegInfoData */ + memset((char *)&segresource->Stat->Info + + segresource->Stat->Info.FailedTmpDirOffset, + 0, + segresource->Stat->Info.Size - + segresource->Stat->Info.FailedTmpDirOffset); - if (segresource->Stat->Info.FailedTmpDirOffset == 0) - { - Assert(segresource->Stat->FailedTmpDirNum == 0); - segresource->Stat->Info.FailedTmpDirOffset = segresource->Stat->Info.HostNameOffset + - __SIZE_ALIGN64(segresource->Stat->Info.HostNameLen+1); - if (segresource->Stat->Info.GRMHostNameLen != 0 && segresource->Stat->Info.GRMHostNameOffset != 0) - segresource->Stat->Info.FailedTmpDirOffset += __SIZE_ALIGN64(segresource->Stat->Info.GRMHostNameLen+1); - if (segresource->Stat->Info.GRMRackNameLen != 0 && segresource->Stat->Info.GRMRackNameOffset != 0) - segresource->Stat->Info.FailedTmpDirOffset += __SIZE_ALIGN64(segresource->Stat->Info.GRMRackNameLen+1); - } + if (segstat->FailedTmpDirNum != 0) + { + /* Update failed temporary directory string in SegInfoData */ + memcpy((char *)&segresource->Stat->Info + segresource->Stat->Info.FailedTmpDirOffset, + GET_SEGINFO_FAILEDTMPDIR(&segstat->Info), + strlen(GET_SEGINFO_FAILEDTMPDIR(&segstat->Info))); + } + else + { + segresource->Stat->Info.FailedTmpDirOffset = 0; + } + segresource->Stat->Info.FailedTmpDirLen = segstat->Info.FailedTmpDirLen; + segresource->Stat->FailedTmpDirNum = segstat->FailedTmpDirNum; - /* clear old failed temporary directory string in SegInfoData */ - memset((char *)&segresource->Stat->Info + - segresource->Stat->Info.FailedTmpDirOffset, - 0, - segresource->Stat->Info.Size - - segresource->Stat->Info.FailedTmpDirOffset); + elog(LOG, "Resource manager change segment %s(%d) 's " + "failed temporary directory is changed to '%s'", + GET_SEGRESOURCE_HOSTNAME(segresource), + segid, + segresource->Stat->FailedTmpDirNum == 0 ? + "" : GET_SEGINFO_FAILEDTMPDIR(&segresource->Stat->Info)); + + elog(RMLOG, "After resource manager " + "updates segment failed temporary directory, " + "GRM hostname:%s, GRM rackname:%s", + segresource->Stat->Info.GRMHostNameLen == 0 ? + "":GET_SEGINFO_GRMHOSTNAME(&(segresource->Stat->Info)), + segresource->Stat->Info.GRMRackNameLen == 0 ? + "":GET_SEGINFO_GRMRACKNAME(&(segresource->Stat->Info))); + } - if (segstat->FailedTmpDirNum != 0) - { - memcpy((char *)&segresource->Stat->Info + segresource->Stat->Info.FailedTmpDirOffset, - GET_SEGINFO_FAILEDTMPDIR(&segstat->Info), - strlen(GET_SEGINFO_FAILEDTMPDIR(&segstat->Info))); - } - else - { - segresource->Stat->Info.FailedTmpDirOffset = 0; - } - segresource->Stat->Info.FailedTmpDirLen = segstat->Info.FailedTmpDirLen; - segresource->Stat->FailedTmpDirNum = segstat->FailedTmpDirNum; - - elog(RMLOG, "After resource manager " - "updates segment failed temporary directory, " - "GRM hostname:%s, GRM rackname:%s", - segresource->Stat->Info.GRMHostNameLen == 0 ? - "":GET_SEGINFO_GRMHOSTNAME(&(segresource->Stat->Info)), - segresource->Stat->Info.GRMRackNameLen == 0 ? - "":GET_SEGINFO_GRMRACKNAME(&(segresource->Stat->Info))); - - setSegResHAWQAvailability(segresource, segstat->FTSAvailable); - if (Gp_role != GP_ROLE_UTILITY) - { - update_segment_failed_tmpdir(segresource->Stat->ID + REGISTRATION_ORDER_OFFSET, - segresource->Stat->FTSAvailable == RESOURCE_SEG_STATUS_AVAILABLE ? - SEGMENT_STATUS_UP:SEGMENT_STATUS_DOWN, - segresource->Stat->FailedTmpDirNum, - segresource->Stat->FailedTmpDirNum == 0 ? - "" : GET_SEGINFO_FAILEDTMPDIR(&segresource->Stat->Info)); - } + /* Set or clear failed temporary directory flag */ + if (segresource->Stat->FailedTmpDirNum != 0) + { + segresource->Stat->StatusDesc |= SEG_STATUS_FAILED_TMPDIR; + } + else + { + if ((segresource->Stat->StatusDesc & SEG_STATUS_FAILED_TMPDIR) != 0) + segresource->Stat->StatusDesc &= ~SEG_STATUS_FAILED_TMPDIR; + } + + if (segresource->Stat->StatusDesc == 0) + { + if (oldStatus == RESOURCE_SEG_STATUS_UNAVAILABLE || + oldStatus == RESOURCE_SEG_STATUS_UNSET) + setSegResHAWQAvailability(segresource, RESOURCE_SEG_STATUS_AVAILABLE); + } + else + { + if (oldStatus == RESOURCE_SEG_STATUS_AVAILABLE) + setSegResHAWQAvailability(segresource, RESOURCE_SEG_STATUS_UNAVAILABLE); + } - if (statusChanged) + if (oldStatusDesc != segresource->Stat->StatusDesc || + tmpDirChanged) + { + /* + * StatusDesc or failed temporary directory path is changed, + * update the gp_segment_configuration table and insert a row + * into gp_configuration_history table + */ + if (Gp_role != GP_ROLE_UTILITY) + { + SimpStringPtr description = build_segment_status_description(segresource->Stat); + update_segment_status(segresource->Stat->ID + REGISTRATION_ORDER_OFFSET, + IS_SEGSTAT_FTSAVAILABLE(segresource->Stat) ? + SEGMENT_STATUS_UP:SEGMENT_STATUS_DOWN, + (description->Len > 0)?description->Str:""); + + elog(LOG, "Resource manager update node(%s) information with heartbeat report," + "status:'%c', description:%s", + GET_SEGRESOURCE_HOSTNAME(segresource), + IS_SEGSTAT_FTSAVAILABLE(segresource->Stat) ? + SEGMENT_STATUS_UP:SEGMENT_STATUS_DOWN, + (description->Len > 0)?description->Str:""); + + add_segment_history_row(segresource->Stat->ID + REGISTRATION_ORDER_OFFSET, + GET_SEGRESOURCE_HOSTNAME(segresource), + IS_SEGSTAT_FTSAVAILABLE(segresource->Stat) ? + SEG_STATUS_DESCRIPTION_UP:description->Str); + if (description != NULL) { - *capstatchanged = true; - /* - * Segment is set from up to down, return resource. - */ - if (oldStatus == RESOURCE_SEG_STATUS_AVAILABLE) - { - returnAllGRMResourceFromSegment(segresource); - } + freeSimpleStringContent(description); + rm_pfree(PCONTEXT, description); } + } + } - elog(LOG, "Master resource manager sets segment %s(%d)'s " - "failed temporary directory to '%s', status:%c", - GET_SEGRESOURCE_HOSTNAME(segresource), - segid, - segresource->Stat->FailedTmpDirNum == 0 ? - "" : GET_SEGINFO_FAILEDTMPDIR(&segresource->Stat->Info), - segresource->Stat->FTSAvailable == RESOURCE_SEG_STATUS_AVAILABLE ? - SEGMENT_STATUS_UP : SEGMENT_STATUS_DOWN); + if (oldStatus != segresource->Stat->FTSAvailable) + { + *capstatchanged = true; + /* + * Segment is set from up to down, return resource. + */ + if (oldStatus == RESOURCE_SEG_STATUS_AVAILABLE) + { + returnAllGRMResourceFromSegment(segresource); } } @@ -1161,6 +1323,7 @@ int updateHAWQSegWithGRMSegStat( SegStat segstat) SegResource segres = NULL; SegStat newSegStat = NULL; int32_t segid = SEGSTAT_ID_INVALID; + bool statusDescChange = false; /* Anyway, the host GRM capacity is updated here if the cluster level * capacity is fixed. @@ -1305,10 +1468,8 @@ int updateHAWQSegWithGRMSegStat( SegStat segstat) elog(RMLOG, "After resource manager " "updates segment info's GRM host name and rack name, " "failed temporary directory: %s", - segres->Stat->FailedTmpDirNum == 0 ? "":GET_SEGINFO_FAILEDTMPDIR(&(segres->Stat->Info))); - - /* Always set segment global resource manager available. */ - setSegResGLOBAvailability(segres, RESOURCE_SEG_STATUS_AVAILABLE); + segres->Stat->FailedTmpDirNum == 0 ? + "":GET_SEGINFO_FAILEDTMPDIR(&(segres->Stat->Info))); if ( segres->Stat->GRMTotalMemoryMB != segstat->GRMTotalMemoryMB || segres->Stat->GRMTotalCore != segstat->GRMTotalCore ) @@ -1320,14 +1481,20 @@ int updateHAWQSegWithGRMSegStat( SegStat segstat) segres->Stat->GRMTotalMemoryMB = segstat->GRMTotalMemoryMB; segres->Stat->GRMTotalCore = segstat->GRMTotalCore; - /* Update overall cluster resource capacity. */ - minusResourceBundleData(&(PRESPOOL->GRMTotal), oldgrmmem, oldgrmcore); - addResourceBundleData(&(PRESPOOL->GRMTotal), - segres->Stat->GRMTotalMemoryMB, - segres->Stat->GRMTotalCore); + /* + * If this segment is FTS available, then + * update overall cluster resource capacity. + */ + if (IS_SEGSTAT_FTSAVAILABLE(segres->Stat)) + { + minusResourceBundleData(&(PRESPOOL->GRMTotal), oldgrmmem, oldgrmcore); + addResourceBundleData(&(PRESPOOL->GRMTotal), + segres->Stat->GRMTotalMemoryMB, + segres->Stat->GRMTotalCore); + } elog(LOG, "Resource manager finds host %s capacity changed from " - "GRM (%d MB, %d CORE) to GRM (%d MB, %d CORE)", + "GRM (%d MB, %d CORE) to GRM (%d MB, %d CORE)", GET_SEGRESOURCE_HOSTNAME(segres), oldgrmmem, oldgrmcore, @@ -1335,6 +1502,49 @@ int updateHAWQSegWithGRMSegStat( SegStat segstat) segres->Stat->GRMTotalCore); } + segres->Stat->GRMHandled = true; + /* Clear no GRM node report flag. */ + if ((segres->Stat->StatusDesc & SEG_STATUS_NO_GRM_NODE_REPORT) != 0) + { + segres->Stat->StatusDesc &= ~SEG_STATUS_NO_GRM_NODE_REPORT; + statusDescChange = true; + } + + /* + * If get GRM node report, and there is no other flag, + * mark this segment to UP + */ + if (!IS_SEGSTAT_FTSAVAILABLE(segres->Stat) && segres->Stat->StatusDesc == 0) + { + Assert(statusDescChange == true); + setSegResHAWQAvailability(segres, RESOURCE_SEG_STATUS_AVAILABLE); + } + + if (statusDescChange && Gp_role != GP_ROLE_UTILITY) + { + SimpStringPtr description = build_segment_status_description(segres->Stat); + update_segment_status(segres->Stat->ID + REGISTRATION_ORDER_OFFSET, + IS_SEGSTAT_FTSAVAILABLE(segres->Stat) ? + SEGMENT_STATUS_UP:SEGMENT_STATUS_DOWN, + (description->Len > 0)?description->Str:""); + add_segment_history_row(segres->Stat->ID + REGISTRATION_ORDER_OFFSET, + GET_SEGRESOURCE_HOSTNAME(segres), + IS_SEGSTAT_FTSAVAILABLE(segres->Stat) ? + SEG_STATUS_DESCRIPTION_UP:description->Str); + + elog(LOG, "Resource manager update node(%s) information with yarn node report," + "status:'%c', description:%s", + GET_SEGRESOURCE_HOSTNAME(segres), + IS_SEGSTAT_FTSAVAILABLE(segres->Stat) ? + SEGMENT_STATUS_UP:SEGMENT_STATUS_DOWN, + (description->Len > 0)?description->Str:""); + if (description != NULL) + { + freeSimpleStringContent(description); + rm_pfree(PCONTEXT, description); + } + } + int32_t curratio = 0; if (DRMGlobalInstance->ImpType == YARN_LIBYARN && segres->Stat->GRMTotalMemoryMB > 0 && @@ -1347,7 +1557,7 @@ int updateHAWQSegWithGRMSegStat( SegStat segstat) return FUNC_RETURN_OK; } -void setAllSegResourceGRMUnavailable(void) +void setAllSegResourceGRMUnhandled(void) { List *allsegres = NULL; ListCell *cell = NULL; @@ -1356,7 +1566,7 @@ void setAllSegResourceGRMUnavailable(void) foreach(cell, allsegres) { SegResource segres = (SegResource)(((PAIR)lfirst(cell))->Value); - setSegResGLOBAvailability(segres, RESOURCE_SEG_STATUS_UNAVAILABLE); + segres->Stat->GRMHandled = false; } freePAIRRefList(&(PRESPOOL->Segments), &allsegres); } @@ -1432,7 +1642,6 @@ SegResource createSegResource(SegStat segstat) res->LastUpdateTime = gettime_microsec(); res->RUAlivePending = false; res->Stat->FTSAvailable = RESOURCE_SEG_STATUS_UNSET; - res->Stat->GRMAvailable = RESOURCE_SEG_STATUS_UNSET; for ( int i = 0 ; i < RESOURCE_QUEUE_RATIO_SIZE ; ++i ) { @@ -1455,13 +1664,6 @@ int setSegStatHAWQAvailability( SegStat segstat, uint8_t newstatus) return res; } -int setSegStatGLOBAvailability( SegStat segstat, uint8_t newstatus) -{ - int res = segstat->GRMAvailable; - segstat->GRMAvailable = newstatus; - return res; -} - /* Set hawq status of host, return the old status */ int setSegResHAWQAvailability( SegResource segres, uint8_t newstatus) { @@ -1499,12 +1701,7 @@ int setSegResHAWQAvailability( SegResource segres, uint8_t newstatus) segres->Stat->GRMTotalCore); addNewResourceToResourceManagerByBundle(&(segres->Allocated)); - if ( (DRMGlobalInstance->ImpType == NONE_HAWQ2) || - (DRMGlobalInstance->ImpType != NONE_HAWQ2 && - IS_SEGSTAT_GRMAVAILABLE(segres->Stat))) - { - PRESPOOL->AvailNodeCount++; - } + PRESPOOL->AvailNodeCount++; } else { @@ -1528,12 +1725,6 @@ int setSegResHAWQAvailability( SegResource segres, uint8_t newstatus) return res; } -int setSegResGLOBAvailability( SegResource segres, uint8_t newstatus) -{ - int res = setSegStatGLOBAvailability(segres->Stat, newstatus); - return res; -} - /* Generate HAWQ host report. */ void generateSegResourceReport(int32_t segid, SelfMaintainBuffer buff) { @@ -1549,15 +1740,14 @@ void generateSegResourceReport(int32_t segid, SelfMaintainBuffer buff) static char reporthead[256]; int headsize = sprintf(reporthead, - "SEGMENT:ID=%d, " - "HAWQAVAIL=%d,GLOBAVAIL=%d. " - "FTS( %d MB, %d CORE). " + "SEGMENT:ID=%d, " + "HAWQAVAIL=%d. " + "FTS( %d MB, %d CORE). " "GRM( %d MB, %d CORE). " "MEM=%d(%d) MB. " "CORE=%lf(%lf).\n", seg->Stat->ID, seg->Stat->FTSAvailable, - seg->Stat->GRMAvailable, seg->Stat->FTSTotalMemoryMB, seg->Stat->FTSTotalCore, seg->Stat->GRMTotalMemoryMB, @@ -1565,7 +1755,7 @@ void generateSegResourceReport(int32_t segid, SelfMaintainBuffer buff) seg->Allocated.MemoryMB, seg->Available.MemoryMB, seg->Allocated.Core, - seg->Available.Core); + seg->Available.Core); appendSelfMaintainBuffer(buff, reporthead, headsize); generateSegInfoReport(&(seg->Stat->Info), buff); } @@ -1655,13 +1845,12 @@ void generateSegStatReport(SegStat segstat, SelfMaintainBuffer buff) static char reporthead[256]; int reportheadlen = 0; reportheadlen = sprintf(reporthead, - "NODE:ID=%d,HAWQ %s, GRM %s, " - "HAWQ CAP (%d MB, %lf CORE), " + "NODE:ID=%d,HAWQ %s, " + "HAWQ CAP (%d MB, %lf CORE), " "GRM CAP(%d MB, %lf CORE),", - segstat->ID, + segstat->ID, segstat->FTSAvailable ? "AVAIL" : "UNAVAIL", - segstat->GRMAvailable ? "AVAIL" : "UNAVAIL", - segstat->FTSTotalMemoryMB, + segstat->FTSTotalMemoryMB, segstat->FTSTotalCore * 1.0, segstat->GRMTotalMemoryMB, segstat->GRMTotalCore * 1.0); @@ -3004,9 +3193,9 @@ void returnAllGRMResourceFromSegment(SegResource segres) /* * Go through each segment, and return the GRM containers in those segments - * global resource manager unavailable. + * with GRM unavailable or FTS unavailable. */ -void returnAllGRMResourceFromGRMUnavailableSegments(void) +void returnAllGRMResourceFromUnavailableSegments(void) { List *allsegres = NULL; ListCell *cell = NULL; @@ -3016,7 +3205,7 @@ void returnAllGRMResourceFromGRMUnavailableSegments(void) foreach(cell, allsegres) { SegResource segres = (SegResource)(((PAIR)lfirst(cell))->Value); - if ( IS_SEGSTAT_GRMAVAILABLE(segres->Stat) ) + if (IS_SEGSTAT_FTSAVAILABLE(segres->Stat)) { continue; } @@ -4208,8 +4397,7 @@ void refreshAvailableNodeCount(void) SegResource segres = (SegResource)(pair->Value); Assert( segres != NULL ); - if ( IS_SEGSTAT_FTSAVAILABLE(segres->Stat) && - IS_SEGSTAT_GRMAVAILABLE(segres->Stat) ) + if ( IS_SEGSTAT_FTSAVAILABLE(segres->Stat) ) { PRESPOOL->AvailNodeCount++; } @@ -4335,10 +4523,12 @@ void fixClusterMemoryCoreRatio(void) } else { - if ( !IS_SEGSTAT_GRMAVAILABLE(segres->Stat) ) - { - continue; - } + /* + * If this segment is FTS available, + * RM should get GRM report for it. + */ + Assert((segres->Stat->StatusDesc & SEG_STATUS_NO_GRM_NODE_REPORT) + == 0); memorymb = segres->Stat->GRMTotalMemoryMB; core = segres->Stat->GRMTotalCore; } @@ -4389,10 +4579,12 @@ void fixClusterMemoryCoreRatio(void) } else { - if ( !IS_SEGSTAT_GRMAVAILABLE(segres->Stat) ) - { - continue; - } + /* + * If this segment is FTS available, + * RM should get GRM report for it. + */ + Assert((segres->Stat->StatusDesc & SEG_STATUS_NO_GRM_NODE_REPORT) + == 0); memorymb = segres->Stat->GRMTotalMemoryMB; core = segres->Stat->GRMTotalCore; } @@ -4563,8 +4755,7 @@ void adjustSegmentCapacityForGRM(SegResource segres) adjustMemoryCoreValue(&(segres->Stat->GRMTotalMemoryMB), &(segres->Stat->GRMTotalCore)); - if ( !IS_SEGSTAT_FTSAVAILABLE(segres->Stat) || - !IS_SEGSTAT_GRMAVAILABLE(segres->Stat)) + if (!IS_SEGSTAT_FTSAVAILABLE(segres->Stat)) { return; } @@ -4628,9 +4819,8 @@ void dumpResourcePoolHosts(const char *filename) segresource->Stat->FTSTotalCore, segresource->Stat->GRMTotalMemoryMB, segresource->Stat->GRMTotalCore); - fprintf(fp, "HOST_AVAILABLITY(HAWQAvailable=%s:GLOBAvailable=%s)\n", - segresource->Stat->FTSAvailable == 0 ? "false" : "true", - segresource->Stat->GRMAvailable == 0 ? "false" : "true"); + fprintf(fp, "HOST_AVAILABLITY(HAWQAvailable=%s)\n", + segresource->Stat->FTSAvailable == 0 ? "false" : "true"); fprintf(fp, "HOST_RESOURCE(AllocatedMemory=%d:AllocatedCores=%f:" "AvailableMemory=%d:AvailableCores=%f:" "IOBytesWorkload="INT64_FORMAT":" @@ -4686,3 +4876,66 @@ void dumpResourcePoolHosts(const char *filename) fclose(fp); } + +static const char* SegStatusDesc[] = { + "heartbeat timeout", + "failed probing segment", + "communication error", + "failed temporary directory", + "resource manager process was reset", + "no global node report" +}; + +/* + * Build a string of status description for SegStat. + * This string contains the reason why this segment is DOWN. + * e.g. the failed temporary directories, heartbeat timeout. + */ +SimpStringPtr build_segment_status_description(SegStat segstat) +{ + SimpStringPtr description = createSimpleString(PCONTEXT); + SelfMaintainBufferData buf; + initializeSelfMaintainBuffer(&buf, PCONTEXT); + uint32_t tmp,cnt; + + if (segstat->StatusDesc == 0) + goto _exit; + + /* Count the number of flags */ + cnt = tmp = segstat->StatusDesc; + while (tmp != 0) + { + tmp = tmp >> 1; + cnt -= tmp; + } + + for (int idx = 0, tmp = 0; idx < sizeof(SegStatusDesc)/sizeof(char*); idx++) + { + if ((segstat->StatusDesc & 1<Info), + strlen(GET_SEGINFO_FAILEDTMPDIR(&segstat->Info))); + } + if (tmp != cnt) + appendSelfMaintainBuffer(&buf, ";", 1); + } + } + + if (buf.Size > 0) + { + static char zeropad = '\0'; + appendSMBVar(&buf, zeropad); + setSimpleStringNoLen(description, buf.Buffer); + destroySelfMaintainBuffer(&buf); + } + +_exit: + return description; +} + http://git-wip-us.apache.org/repos/asf/incubator-hawq/blob/d1780628/src/backend/resourcemanager/resqueuemanager.c ---------------------------------------------------------------------- diff --git a/src/backend/resourcemanager/resqueuemanager.c b/src/backend/resourcemanager/resqueuemanager.c index 00de72d..9af0d7c 100644 --- a/src/backend/resourcemanager/resqueuemanager.c +++ b/src/backend/resourcemanager/resqueuemanager.c @@ -2631,8 +2631,7 @@ void refreshActualMinGRMContainerPerSeg(void) foreach(cell, allsegres) { SegResource segres = (SegResource)(((PAIR)lfirst(cell))->Value); - if ( !IS_SEGSTAT_FTSAVAILABLE(segres->Stat) || - !IS_SEGSTAT_GRMAVAILABLE(segres->Stat) ) + if (IS_SEGSTAT_FTSAVAILABLE(segres->Stat)) { continue; } http://git-wip-us.apache.org/repos/asf/incubator-hawq/blob/d1780628/src/backend/utils/gp/segadmin.c ---------------------------------------------------------------------- diff --git a/src/backend/utils/gp/segadmin.c b/src/backend/utils/gp/segadmin.c index 6b49926..775cadb 100644 --- a/src/backend/utils/gp/segadmin.c +++ b/src/backend/utils/gp/segadmin.c @@ -277,8 +277,7 @@ gp_add_master_standby(PG_FUNCTION_ARGS) values[Anum_gp_segment_configuration_port - 1] = Int32GetDatum(master->port); values[Anum_gp_segment_configuration_hostname - 1] = PG_GETARG_DATUM(0); values[Anum_gp_segment_configuration_address - 1] = PG_GETARG_DATUM(1); - nulls[Anum_gp_segment_configuration_failed_tmpdir_num - 1] = true; - nulls[Anum_gp_segment_configuration_failed_tmpdir - 1] = true; + nulls[Anum_gp_segment_configuration_description - 1] = true; tuple = caql_form_tuple(pcqCtx, values, nulls); http://git-wip-us.apache.org/repos/asf/incubator-hawq/blob/d1780628/src/include/catalog/gp_configuration.h ---------------------------------------------------------------------- diff --git a/src/include/catalog/gp_configuration.h b/src/include/catalog/gp_configuration.h index eaa0893..ee127e6 100644 --- a/src/include/catalog/gp_configuration.h +++ b/src/include/catalog/gp_configuration.h @@ -135,9 +135,10 @@ typedef FormData_gp_configuration *Form_gp_configuration; CREATE TABLE gp_configuration_history with (camelcase=GpConfigHistory, shared=true, oid=false, relid=5006, reltype_oid=6434, content=MASTER_ONLY) ( - time timestamp with time zone, - dbid smallint, - "desc" text + time timestamp with time zone, + registration_order integer, + hostname text, + description text ); TIDYCAT_ENDDEF @@ -146,8 +147,8 @@ typedef FormData_gp_configuration *Form_gp_configuration; /* TIDYCAT_BEGIN_CODEGEN WARNING: DO NOT MODIFY THE FOLLOWING SECTION: - Generated by tidycat.pl version 21. - on Wed Nov 24 14:13:16 2010 + Generated by tidycat.pl version 34 + on Fri Feb 26 10:43:15 2016 */ @@ -182,9 +183,10 @@ typedef FormData_gp_configuration *Form_gp_configuration; CATALOG(gp_configuration_history,5006) BKI_SHARED_RELATION BKI_WITHOUT_OIDS { - timestamptz time; - int2 dbid; - text desc; + timestamptz time; + int4 registration_order; + text hostname; + text description; } FormData_gp_configuration_history; #undef timestamptz @@ -202,10 +204,11 @@ typedef FormData_gp_configuration_history *Form_gp_configuration_history; * compiler constants for gp_configuration_history * ---------------- */ -#define Natts_gp_configuration_history 3 -#define Anum_gp_configuration_history_time 1 -#define Anum_gp_configuration_history_dbid 2 -#define Anum_gp_configuration_history_desc 3 +#define Natts_gp_configuration_history 4 +#define Anum_gp_configuration_history_time 1 +#define Anum_gp_configuration_history_registration_order 2 +#define Anum_gp_configuration_history_hostname 3 +#define Anum_gp_configuration_history_description 4 /* TIDYCAT_END_CODEGEN */ http://git-wip-us.apache.org/repos/asf/incubator-hawq/blob/d1780628/src/include/catalog/gp_segment_config.h ---------------------------------------------------------------------- diff --git a/src/include/catalog/gp_segment_config.h b/src/include/catalog/gp_segment_config.h index f9f2def..fa577f1 100644 --- a/src/include/catalog/gp_segment_config.h +++ b/src/include/catalog/gp_segment_config.h @@ -53,8 +53,7 @@ port integer , hostname text , address text , - failed_tmpdir_num integer , - failed_tmpdir text + description text ); create unique index on gp_segment_configuration(registration_order) with (indexid=6106, indexname=gp_segment_config_registration_order_index); @@ -95,8 +94,7 @@ CATALOG(gp_segment_configuration,5036) BKI_SHARED_RELATION BKI_WITHOUT_OIDS int4 port; text hostname; text address; - int4 failed_tmpdir_num; - text failed_tmpdir; + text description; } FormData_gp_segment_configuration; @@ -112,15 +110,14 @@ typedef FormData_gp_segment_configuration *Form_gp_segment_configuration; * compiler constants for gp_segment_configuration * ---------------- */ -#define Natts_gp_segment_configuration 8 +#define Natts_gp_segment_configuration 7 #define Anum_gp_segment_configuration_registration_order 1 #define Anum_gp_segment_configuration_role 2 #define Anum_gp_segment_configuration_status 3 #define Anum_gp_segment_configuration_port 4 #define Anum_gp_segment_configuration_hostname 5 #define Anum_gp_segment_configuration_address 6 -#define Anum_gp_segment_configuration_failed_tmpdir_num 7 -#define Anum_gp_segment_configuration_failed_tmpdir 8 +#define Anum_gp_segment_configuration_description 7 /* TIDYCAT_END_CODEGEN */ http://git-wip-us.apache.org/repos/asf/incubator-hawq/blob/d1780628/src/include/catalog/pg_proc.h ---------------------------------------------------------------------- diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h index caf22a9..a6f8970 100644 --- a/src/include/catalog/pg_proc.h +++ b/src/include/catalog/pg_proc.h @@ -10196,7 +10196,6 @@ DESCR("Convert a cloned master catalog for use as a segment"); DATA(insert OID = 5053 ( gp_activate_standby PGNSP PGUID 12 f f f f v 0 16 f "" _null_ _null_ _null_ gp_activate_standby - _null_ n )); DESCR("Activate a standby"); - /* We cheat in the following two functions: they are technically volatile but */ /* we can only dispatch them if they're immutable :(. */ /* gp_add_segment_persistent_entries(int2, int2, _text) => bool */ @@ -10285,6 +10284,10 @@ DESCR("Remove an entry from gp_relation_node"); DATA(insert OID = 5074 ( gp_persistent_relation_node_check PGNSP PGUID 12 f f f t v 0 6990 f "" _null_ _null_ _null_ gp_persistent_relation_node_check - _null_ n )); DESCR("physical filesystem information"); +/* gp_remove_segment_history() => bool */ +DATA(insert OID = 5075 ( gp_remove_segment_history PGNSP PGUID 12 f f f f v 0 16 f "" _null_ _null_ _null_ gp_remove_segment_history - _null_ n )); +DESCR("Remove all entries from the gp_configuration_history"); + /* cosh(float8) => float8 */ DATA(insert OID = 3539 ( cosh PGNSP PGUID 12 f f f f i 1 701 f "701" _null_ _null_ _null_ dcosh - _null_ n )); DESCR("Hyperbolic cosine function"); http://git-wip-us.apache.org/repos/asf/incubator-hawq/blob/d1780628/src/include/utils/builtins.h ---------------------------------------------------------------------- diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h index 48de877..55ff100 100644 --- a/src/include/utils/builtins.h +++ b/src/include/utils/builtins.h @@ -1120,6 +1120,7 @@ extern Datum gp_remove_master_standby(PG_FUNCTION_ARGS); extern Datum gp_activate_standby(PG_FUNCTION_ARGS); extern Datum gp_add_segment_mirror(PG_FUNCTION_ARGS); +extern Datum gp_remove_segment_history(PG_FUNCTION_ARGS); extern Datum gp_remove_segment_mirror(PG_FUNCTION_ARGS); extern Datum gp_add_segment(PG_FUNCTION_ARGS); extern Datum gp_remove_segment(PG_FUNCTION_ARGS); http://git-wip-us.apache.org/repos/asf/incubator-hawq/blob/d1780628/tools/bin/gppylib/data/2.0.json ---------------------------------------------------------------------- diff --git a/tools/bin/gppylib/data/2.0.json b/tools/bin/gppylib/data/2.0.json index c65e12d..019c378 100644 --- a/tools/bin/gppylib/data/2.0.json +++ b/tools/bin/gppylib/data/2.0.json @@ -109,8 +109,9 @@ "CamelCaseRelationId" : "GpConfigHistoryRelationId", "UppercaseReltypeOid" : "GP_CONFIGURATION_HISTORY_RELTYPE_OID", "colh" : { - "dbid" : "int2", - "desc" : "text", + "description" : "text", + "hostname" : "text", + "registration_order" : "int4", "time" : "timestamptz" }, "cols" : [ @@ -121,19 +122,24 @@ "sqltype" : "timestamp_with_time_zone" }, { - "colname" : "dbid", - "ctype" : "int2", - "sqltype" : "smallint" + "colname" : "registration_order", + "ctype" : "int4", + "sqltype" : "integer" + }, + { + "colname" : "hostname", + "ctype" : "text", + "sqltype" : "text" }, { - "colname" : "desc", + "colname" : "description", "ctype" : "text", "sqltype" : "text" } ], "filename" : "gp_configuration.h", "relid_comment_tag" : "/* relation id: 5006 - gp_configuration_history */\n", - "tabdef_text" : "\n CREATE TABLE gp_configuration_history\n with (camelcase=GpConfigHistory, shared=true, oid=false, relid=5006, reltype_oid=6434, content=MASTER_ONLY)\n (\n time timestamp with time zone,\n dbid smallint,\n \"desc\" text\n )", + "tabdef_text" : "\n CREATE TABLE gp_configuration_history\n with (camelcase=GpConfigHistory, shared=true, oid=false, relid=5006, reltype_oid=6434, content=MASTER_ONLY)\n (\n time timestamp with time zone,\n registration_order integer,\n hostname text,\n description text\n )", "tzhack" : "\"time\"", "with" : { "bootstrap" : 0, @@ -868,8 +874,7 @@ "UppercaseToastReltypeOid" : "GP_SEGMENT_CONFIGURATION_TOAST_RELTYPE_OID", "colh" : { "address" : "text", - "failed_tmpdir" : "text", - "failed_tmpdir_num" : "int4", + "description" : "text", "hostname" : "text", "port" : "int4", "registration_order" : "int4", @@ -909,12 +914,7 @@ "sqltype" : "text" }, { - "colname" : "failed_tmpdir_num", - "ctype" : "int4", - "sqltype" : "integer" - }, - { - "colname" : "failed_tmpdir", + "colname" : "description", "ctype" : "text", "sqltype" : "text" } @@ -955,7 +955,7 @@ } ], "relid_comment_tag" : "/* relation id: 5036 - gp_segment_configuration */\n", - "tabdef_text" : "\n CREATE TABLE gp_segment_configuration\n with (camelcase=GpSegmentConfig, shared=true, oid=false, relid=5036, reltype_oid=6442, toast_oid=2900, toast_index=2901, toast_reltype=2906, content=MASTER_ONLY)\n (\n registration_order integer ,\n role \"char\" ,\n status \"char\" ,\n port integer ,\n hostname text ,\n address text ,\n failed_tmpdir_num integer ,\n failed_tmpdir text\n )", + "tabdef_text" : "\n CREATE TABLE gp_segment_configuration\n with (camelcase=GpSegmentConfig, shared=true, oid=false, relid=5036, reltype_oid=6442, toast_oid=2900, toast_index=2901, toast_reltype=2906, content=MASTER_ONLY)\n (\n registration_order integer ,\n role \"char\" ,\n status \"char\" ,\n port integer ,\n hostname text ,\n address text ,\n description text\n )", "with" : { "bootstrap" : 0, "camelcase" : "GpSegmentConfig",