Date: Thu, 18 Feb 2021 08:51:00 +0000 (UTC)
From: "Tamas Payer (Jira)"
To: issues@ambari.apache.org
Reply-To: dev@ambari.apache.org
Subject: [jira] [Resolved] (AMBARI-25569) Reassess Ambari Metrics data migration

     [
https://issues.apache.org/jira/browse/AMBARI-25569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tamas Payer resolved AMBARI-25569.
----------------------------------
    Resolution: Fixed

> Reassess Ambari Metrics data migration
> --------------------------------------
>
>                 Key: AMBARI-25569
>                 URL: https://issues.apache.org/jira/browse/AMBARI-25569
>             Project: Ambari
>          Issue Type: Task
>          Components: ambari-metrics
>    Affects Versions: 2.7.3, 2.7.4, 2.7.5
>            Reporter: Tamas Payer
>            Assignee: Tamas Payer
>            Priority: Major
>              Labels: metric-collector, migration, pull-request-available
>             Fix For: 2.7.6
>
>          Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> The data migration process of Ambari Metrics as described at [https://docs.cloudera.com/HDPDocuments/Ambari-2.7.5.0/bk_ambari-upgrade-major/content/upgrading_HDP_post_upgrade_tasks.html]
> causes issues, such as not migrating data that the user would expect to be migrated (e.g. YARN queue metrics other than the root queue's).
> The data migration is usually invoked with the following command, in which the whitelist file is specified:
>
> {code:java}
> /usr/sbin/ambari-metrics-collector --config /etc/ambari-metrics-collector/conf/ upgrade_start /etc/ambari-metrics-collector/conf/metrics_whitelist "31556952000"
> {code}
> The migration code only looks for the metrics that are present in the whitelist file. This is true even when AMS whitelisting is not enabled. The user will only have those metrics migrated that are present in the whitelist file, which is usually not all that are required.
>
> I suggest the following change:
>  - If the whitelist file parameter *is provided* then
>  ** migrate only the metrics that are in the whitelist file
>  - If the *--allmetrics* value is provided in place of the whitelist file parameter then
>  ** migrate all metrics regardless of other configuration settings
>  - If the whitelist file parameter is *not provided* (and the time period for data migration is also not provided) then
>  ** if whitelisting is *enabled* then
>  *** discover the whitelist file configured in AMS and migrate only the metrics that are in the whitelist file
>  ** if whitelisting is *disabled* then
>  *** migrate *all the metrics* present in the database
>
> *Examples:*
>  * {{*Migrate the metrics present in the whitelist file that are not older than one year (365 days)*}}
> {{/usr/sbin/ambari-metrics-collector --config /etc/ambari-metrics-collector/conf/ upgrade_start /etc/ambari-metrics-collector/conf/metrics_whitelist "365"}}
>  * {{*Migrate the metrics present in the whitelist file that are not older than the default one month (30 days)*}}
> {{/usr/sbin/ambari-metrics-collector --config /etc/ambari-metrics-collector/conf/ upgrade_start /etc/ambari-metrics-collector/conf/metrics_whitelist}}
>  * {{*Migrate all metrics that are not older than one year (365 days)*}}
> {{/usr/sbin/ambari-metrics-collector --config /etc/ambari-metrics-collector/conf/ upgrade_start --allmetrics "365"}}
>  * {{*Migrate all metrics that are not older than the default one month (30 days)*}}
> {{/usr/sbin/ambari-metrics-collector --config /etc/ambari-metrics-collector/conf/ upgrade_start --allmetrics}}
>  * *If whitelisting is enabled then migrate the metrics present in the whitelist file configured in Ambari that are not older than the default one month (30 days).
>    If whitelisting is disabled, migrate all metrics that are not older than the default one month.*
> {{/usr/sbin/ambari-metrics-collector --config /etc/ambari-metrics-collector/conf/ upgrade_start}}
>
> *1. Introduce an '--allmetrics' argument to enforce migration of all metrics regardless of other settings.*
> Due to the suboptimal argument handling, if one wants to define an argument that comes after the 'whitelist file' argument (like the 'starttime'), the 'whitelist file' argument must also be defined.
> But when we don't want to use the whitelist data because we need to migrate all the metrics, the '--allmetrics' argument can be provided instead of 'whitelist file'.
> Example: migrate all the metrics from the last year
> {{/usr/sbin/ambari-metrics-collector --config /etc/ambari-metrics-collector/conf/ upgrade_start --allmetrics "365"}}
>
> *2. The start time handling should be fixed and changed*
>  * The code is intended to migrate data from the "last x milliseconds", as the handling of the default shows, where the period is subtracted from the current timestamp:
> {{public static final long DEFAULT_START_TIME = System.currentTimeMillis() - ONE_MONTH_MILLIS; //Last month}}
> But when the user provided the {{startTime}} value externally, it was not subtracted from the current timestamp but was used as is, which is erroneous.
>  * Also, I suggest using days instead of milliseconds to define the required migration time window, because it is a more realistic and convenient granularity. As in the example above, the command will then migrate data from the last 365 days.
>
> *3. Furthermore, the migration process frequently dies silently while saving the metadata.*
> The log message "Saving metadata to store..." is present in the logs but the "Metadata was saved."
> message is almost never there, and there are no other error messages. I suggest revising the current solution, in which the saving of the metadata is triggered in a shutdown hook.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
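
For readers of the archive, the argument handling proposed in the quoted issue can be sketched roughly as below. This is only an illustration of the described rules, not the code that was merged for AMBARI-25569: the class name, the enum, the method names and the `DEFAULT_DAYS` constant are all hypothetical, and only the `--allmetrics` value, the 30-day default and the "subtract the window from the current time" rule come from the issue text.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of the proposed upgrade_start argument rules; not the actual AMS code. */
final class MigrationArgsSketch {

    /** Which set of metrics a migration run would cover (names are assumptions). */
    enum Scope { WHITELIST_FILE, CONFIGURED_WHITELIST, ALL_METRICS }

    static final String ALL_METRICS_ARG = "--allmetrics"; // value named in the issue
    static final long DEFAULT_DAYS = 30;                  // "default one month (30 days)"

    /**
     * Resolves the migration scope from the first argument after upgrade_start.
     *
     * @param firstArg            whitelist file path, "--allmetrics", or null if absent
     * @param whitelistingEnabled whether whitelisting is enabled in the AMS configuration
     */
    static Scope resolveScope(String firstArg, boolean whitelistingEnabled) {
        if (firstArg == null) {
            // No argument given: fall back to what the AMS configuration says.
            return whitelistingEnabled ? Scope.CONFIGURED_WHITELIST : Scope.ALL_METRICS;
        }
        if (ALL_METRICS_ARG.equals(firstArg)) {
            return Scope.ALL_METRICS;
        }
        return Scope.WHITELIST_FILE;
    }

    /**
     * Converts a "last N days" argument into an absolute start timestamp,
     * subtracting the window from the current time for user input and default alike.
     *
     * @param daysArg   number of days as a string, or null to use the default
     * @param nowMillis current time in epoch milliseconds
     */
    static long resolveStartTime(String daysArg, long nowMillis) {
        long days = (daysArg == null) ? DEFAULT_DAYS : Long.parseLong(daysArg.trim());
        if (days < 0) {
            throw new IllegalArgumentException("days must not be negative: " + days);
        }
        return nowMillis - TimeUnit.DAYS.toMillis(days);
    }

    public static void main(String[] args) {
        String first = args.length > 0 ? args[0] : null;
        String days = args.length > 1 ? args[1] : null;
        // Whitelisting is assumed disabled here purely for demonstration.
        System.out.println("scope = " + resolveScope(first, false));
        System.out.println("start = " + resolveStartTime(days, System.currentTimeMillis()));
    }
}
```

Passing the current time in as a parameter, rather than reading the clock inside the method, keeps the conversion easy to check in isolation. A real implementation would also need to reject malformed day values more gracefully than `Long.parseLong` does.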