From: cestella@apache.org
To: commits@metron.incubator.apache.org
Message-Id: <36a4ec242c2b41b4a215520fdbace034@git.apache.org>
Subject: incubator-metron git commit: METRON-532 Define Profile Period When Calling PROFILE_GET closes apache/incubator-metron#414
Date: Mon, 16 Jan 2017 14:18:42 +0000 (UTC)

Repository: incubator-metron
Updated Branches:
  refs/heads/master 56ff50c3d -> 9ec2cdcdc

METRON-532 Define Profile Period When Calling PROFILE_GET closes apache/incubator-metron#414

Project: http://git-wip-us.apache.org/repos/asf/incubator-metron/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-metron/commit/9ec2cdcd
Tree: http://git-wip-us.apache.org/repos/asf/incubator-metron/tree/9ec2cdcd
Diff: http://git-wip-us.apache.org/repos/asf/incubator-metron/diff/9ec2cdcd

Branch: refs/heads/master
Commit: 9ec2cdcdc4675737fec15f9e958b46f17268a227
Parents: 56ff50c
Author: mattf-horton
Authored: Mon Jan 16 09:18:38 2017 -0500
Committer: cstella
Committed: Mon Jan 16 09:18:38 2017 -0500

----------------------------------------------------------------------
 .../metron-profiler-client/README.md | 89 +++++++--
 .../profiler/client/stellar/GetProfile.java | 179 +++++++++++++++---
 .../metron/profiler/client/GetProfileTest.java | 182 ++++++++++++++++++-
 metron-analytics/metron-profiler/README.md | 30 ++-
 metron-platform/metron-common/README.md | 5 +-
 5 files changed, 427 insertions(+), 58 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/incubator-metron/blob/9ec2cdcd/metron-analytics/metron-profiler-client/README.md
----------------------------------------------------------------------
diff --git a/metron-analytics/metron-profiler-client/README.md b/metron-analytics/metron-profiler-client/README.md
index 8a55739..105fce9 100644
--- a/metron-analytics/metron-profiler-client/README.md
+++ b/metron-analytics/metron-profiler-client/README.md
@@ -22,30 +22,51 @@ These examples assume a profile has been defined called 'snort-alerts' that trac
 }
 ```
-During model scoring the entity being scored, in this case a particular IP address, will be known. The following examples highlight how this profile data might be retrieved.
+During model scoring the entity being scored, in this case a particular IP address, will be known. The following examples shows how this profile data might be retrieved.
-Retrieve all values of 'snort-alerts' from '10.0.0.1' over the past 4 hours.
+The Stellar client consists of the `PROFILE_GET` command, which takes the following arguments:
 ```
-PROFILE_GET('snort-alerts', '10.0.0.1', 4, 'HOURS')
+REQUIRED:
+    profile - The name of the profile
+    entity - The name of the entity
+    durationAgo - How long ago should values be retrieved from?
+    units - The units of 'durationAgo'
+OPTIONAL:
+    groups_list - Optional, must correspond to the 'groupBy' list used in profile creation - List (in square brackets) of
+            groupBy values used to filter the profile. Default is the empty list, meaning groupBy was not used when
+            creating the profile.
+    config_overrides - Optional - Map (in curly braces) of name:value pairs, each overriding the global config parameter
+            of the same name. Default is the empty Map, meaning no overrides.
 ```
+There is an older calling format where `groups_list` is specified as a sequence of group names, "varargs" style, instead of a List object. This format is still supported for backward compatibility, but it is deprecated, and it is disallowed if the optional `config_overrides` argument is used.
-Retrieve all values of 'snort-alerts' from '10.0.0.1' over the past 2 days.
+### Groups_list argument
+The `groups_list` argument in the client must exactly correspond to the [`groupBy`](../metron-profiler#groupby) configuration in the profile definition. If `groupBy` was not used in the profile, `groups_list` must be empty in the client. If `groupBy` was used in the profile, then the client `groups_list` is not optional; it must be the same length as the `groupBy` list, and specify exactly one selected group value for each `groupBy` criterion, in the same order. For example:
 ```
-PROFILE_GET('snort-alerts', '10.0.0.1', 2, 'DAYS')
+If in Profile, the groupBy criteria are: [ “DAY_OF_WEEK()”, “URL_TO_PORT()” ]
+Then in PROFILE_GET, an allowed groups value would be: [ “3”, “8080” ]
+which will select only records from Tuesdays with port number 8080.
 ```
-If the profile had been defined to group the data by weekday versus weekend, then the following example would apply.
+### Configuration and the config_overrides argument
-Retrieve all values of 'snort-alerts' from '10.0.0.1' that occurred on 'weekdays' over the past month.
-```
-PROFILE_GET('snort-alerts', '10.0.0.1', 1, 'MONTHS', 'weekdays')
-```
+By default, the Profiler creates profiles with a period duration of 15 minutes. This means that data is accumulated, summarized and flushed every 15 minutes.
+The Client API must also have knowledge of this duration to correctly retrieve the profile data. If the Client is expecting 15 minute periods, it will not be
+able to read data generated by a Profiler that was configured for 1 hour periods, and will return zero results.
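The effect of a period mismatch can be sketched in a few lines of Java. This is only an illustration under a stated assumption: that the period identifier used in the row key is the timestamp divided by the period length. The class `PeriodMismatchSketch` and its `periodId` helper are hypothetical names, not part of the Metron code base.

```java
import java.util.concurrent.TimeUnit;

public class PeriodMismatchSketch {

    // Hypothetical helper (assumption): the period id is the index of the
    // period that contains the given timestamp.
    static long periodId(long epochMillis, long duration, TimeUnit units) {
        return epochMillis / units.toMillis(duration);
    }

    public static void main(String[] args) {
        long timestamp = TimeUnit.HOURS.toMillis(10);

        // Id computed by a writer that uses 15 minute periods.
        long written = periodId(timestamp, 15, TimeUnit.MINUTES);

        // Id computed by a reader that expects 1 hour periods.
        long lookedUp = periodId(timestamp, 1, TimeUnit.HOURS);

        // The ids differ, so a lookup keyed on the reader's id finds nothing.
        System.out.println(written + " vs " + lookedUp); // prints "40 vs 10"
    }
}
```

Under this assumption the reader simply asks for keys that were never written, which is consistent with an empty result rather than an error.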
-### Configuration
+Similarly, all six Client configuration parameters listed in the table below must match the Profiler configuration parameter settings from the time the profile
+was created. The period duration and other configuration parameters from the Profiler topology are stored in local filesystem at `$METRON_HOME/config/profiler.properties`.
+The Stellar Client API can be configured correspondingly by setting the following properties in Metron's global configuration, on local filesystem at
+`$METRON_HOME/config/zookeeper/global.json`, then uploaded to Zookeeper (at `/metron/topology/global`) by using `zk_load_configs.sh`:
-By default, the Profiler creates Profiles with a period duration of 15 minutes. This means that data is accumulated, summarized and flushed every 15 minutes. The Client API must also have knowledge of this duration to correctly retrieve the profile data. If the client API is expecting 15 minute periods, it will not be able to read data generated by a Profiler that has been configured with a 1 hour period.
+  ```
+  $ cd $METRON_HOME
+  $ bin/zk_load_configs.sh -m PUSH -i config/zookeeper/ -z node1:2181
+  ```
-The period duration can be configured in the Profiler by altering the Profiler topology's static properties file (`$METRON/config/profiler.properties`). The Stellar Client API can be configured by setting the following properties in Metron's global configuration.
+Any of these six Client configuration parameters may be overridden at run time using the `config_overrides` Map argument in PROFILE_GET. The primary use case is
+when historical profiles have been created with a different Profiler configuration than is currently configured, and the analyst needing to access them does not
+want to change the global Client configuration so as not to disrupt the work of other analysts working with current profiles.
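The override behaviour described above can be sketched as a standalone merge of two maps. This is a simplified illustration, not the `GetProfile` implementation: the class `ConfigOverrideSketch` and method `effectiveConfig` are hypothetical names, only the four keys that are spelled out in this commit are listed, and the type checking done by the real code is omitted.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ConfigOverrideSketch {

    // Keys that appear literally in this commit; the remaining recognized
    // keys are left out of the sketch rather than guessed.
    static final List<String> RECOGNIZED = Arrays.asList(
            "profiler.client.period.duration",
            "profiler.client.period.duration.units",
            "profiler.client.salt.divisor",
            "hbase.provider.impl");

    // Start from the recognized global settings, then let overrides win.
    // Unrecognized keys in either map are ignored.
    static Map<String, Object> effectiveConfig(Map<String, Object> global,
                                               Map<String, Object> overrides) {
        Map<String, Object> result = new HashMap<>();
        for (String key : RECOGNIZED) {
            Object value = global.get(key);
            if (value != null) result.put(key, value);
        }
        if (overrides == null) return result;
        for (String key : RECOGNIZED) {
            Object value = overrides.get(key);
            if (value != null) result.put(key, value);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> global = new HashMap<>();
        global.put("profiler.client.period.duration", "15");
        global.put("profiler.client.period.duration.units", "MINUTES");
        global.put("unrelated.key", "ignored");

        Map<String, Object> overrides = new HashMap<>();
        overrides.put("profiler.client.period.duration", "2");

        Map<String, Object> effective = effectiveConfig(global, overrides);
        System.out.println(effective.get("profiler.client.period.duration"));       // prints "2"
        System.out.println(effective.get("profiler.client.period.duration.units")); // prints "MINUTES"
        System.out.println(effective.containsKey("unrelated.key"));                 // prints "false"
    }
}
```

Because the merged map is rebuilt per call, one analyst's override does not touch the global settings that other analysts rely on.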
 | Key | Description | Required | Default |
 | ------------------------------------- | -------- | -------- | -------- |
@@ -56,6 +77,40 @@ The period duration can be configured in the Profiler by altering the Profiler t
 | profiler.client.salt.divisor | The salt divisor used to store profile data. | Optional | 1000 |
 | hbase.provider.impl | The name of the HBaseTableProvider implementation class. | Optional | |
+### Errors
+The most common result of incorrect PROFILE_GET arguments or Client configuration parameters is an empty result set, rather than an error. The Client cannot effectively validate the arguments, because the Profiler configuration parameters may be changed and the profile itself does not store them. The person doing the querying must carry forward the knowledge of the Profiler configuration parameters from the time of profile creation, and use corresponding PROFILE_GET arguments and Client configuration parameters when querying the data.
+
+### Examples
+Retrieve all values of 'snort-alerts' from '10.0.0.1' over the past 4 hours.
+```
+PROFILE_GET('snort-alerts', '10.0.0.1', 4, 'HOURS')
+```
+
+Retrieve all values of 'snort-alerts' from '10.0.0.1' over the past 2 days.
+```
+PROFILE_GET('snort-alerts', '10.0.0.1', 2, 'DAYS')
+```
+
+If the profile had been defined to group the data by weekday versus weekend, then the following example would apply:
+
+Retrieve all values of 'snort-alerts' from '10.0.0.1' that occurred on 'weekdays' over the past month.
+```
+PROFILE_GET('snort-alerts', '10.0.0.1', 1, 'MONTHS', ['weekdays'] )
+```
+
+The client may need to use a configuration different from the current Client configuration settings. For example, perhaps you are on a cluster shared with other analysts, and need to access a profile that was constructed 2 months ago using different period duration, while they are accessing more recent profiles constructed with the currently configured period duration. For this situation, you may use the `config_overrides` argument:
+
+Retrieve all values of 'snort-alerts' from '10.0.0.1' over the past 2 days, with no `groupBy`, and overriding the usual global client configuration parameters for window duration.
+```
+PROFILE_GET('profile1', 'entity1', 2, 'DAYS', [], {'profiler.client.period.duration' : '2', 'profiler.client.period.duration.units' : 'MINUTES'})
+```
+
+Retrieve all values of 'snort-alerts' from '10.0.0.1' that occurred on 'weekdays' over the past month, overriding the usual global client configuration parameters for window duration.
+```
+PROFILE_GET('profile1', 'entity1', 1, 'MONTHS', ['weekdays'], {'profiler.client.period.duration' : '2', 'profiler.client.period.duration.units' : 'MINUTES'})
+```
+
+
 ## Getting Started
 These instructions step through the process of using the Stellar Client API on a live cluster. These instructions assume that the 'Getting Started' instructions included with the [Metron Profiler](../metron-profiler) have been followed. This will create a Profile called 'test' whose data will be retrieved with the Stellar Client API.
@@ -78,9 +133,13 @@ Arguments:
         entity - The name of the entity.
         durationAgo - How long ago should values be retrieved from?
         units - The units of 'durationAgo'.
-        groups - Optional - The groups used to sort the profile.
+        groups_list - Optional, must correspond to the 'groupBy' list used in profile creation - List (in square brackets) of
+                groupBy values used to filter the profile. Default is the empty list, meaning groupBy was not used when
+                creating the profile.
+        config_overrides - Optional - Map (in curly braces) of name:value pairs, each overriding the global config parameter
+                of the same name. Default is the empty Map, meaning no overrides.
-Returns: The profile measurements.
+Returns: The selected profile measurements.
 [Stellar]>>> PROFILE_GET('test','192.168.138.158', 1, 'HOURS')
 [12078.0, 8921.0, 12131.0]

http://git-wip-us.apache.org/repos/asf/incubator-metron/blob/9ec2cdcd/metron-analytics/metron-profiler-client/src/main/java/org/apache/metron/profiler/client/stellar/GetProfile.java
----------------------------------------------------------------------
diff --git a/metron-analytics/metron-profiler-client/src/main/java/org/apache/metron/profiler/client/stellar/GetProfile.java b/metron-analytics/metron-profiler-client/src/main/java/org/apache/metron/profiler/client/stellar/GetProfile.java
index 9d4aa54..beb55e0 100644
--- a/metron-analytics/metron-profiler-client/src/main/java/org/apache/metron/profiler/client/stellar/GetProfile.java
+++ b/metron-analytics/metron-profiler-client/src/main/java/org/apache/metron/profiler/client/stellar/GetProfile.java
@@ -41,6 +41,7 @@ import org.slf4j.LoggerFactory;
 import java.io.IOException;
 import java.util.ArrayList;
+import java.util.HashMap;
 import java.util.List;
 import java.util.Map;
 import java.util.concurrent.TimeUnit;
@@ -65,7 +66,17 @@ import static org.apache.metron.common.dsl.Context.Capabilities.GLOBAL_CONFIG;
  *
  * Retrieve all values for 'entity1' from 'profile1' that occurred on 'weekdays' over the past month.
  *
- * PROFILE_GET('profile1', 'entity1', 1, 'MONTHS', 'weekdays')
+ * PROFILE_GET('profile1', 'entity1', 1, 'MONTHS', ['weekdays'])
+ *
+ * Retrieve all values for 'entity1' from 'profile1' over the past 2 days, with no 'groupBy',
+ * and overriding the usual global client configuration parameters for window duration.
+ *
+ * PROFILE_GET('profile1', 'entity1', 2, 'DAYS', [], {'profiler.client.period.duration' : '2', 'profiler.client.period.duration.units' : 'MINUTES'})
+ *
+ * Retrieve all values for 'entity1' from 'profile1' that occurred on 'weekdays' over the past month,
+ * overriding the usual global client configuration parameters for window duration.
+ *
+ * PROFILE_GET('profile1', 'entity1', 1, 'MONTHS', ['weekdays'], {'profiler.client.period.duration' : '2', 'profiler.client.period.duration.units' : 'MINUTES'})
  *
  */
 @Stellar(
@@ -77,9 +88,13 @@ import static org.apache.metron.common.dsl.Context.Capabilities.GLOBAL_CONFIG;
                 "entity - The name of the entity.",
                 "durationAgo - How long ago should values be retrieved from?",
                 "units - The units of 'durationAgo'.",
-                "groups - Optional - The groups used to sort the profile."
+                "groups_list - Optional, must correspond to the 'groupBy' list used in profile creation - List (in square brackets) of "+
+                        "groupBy values used to filter the profile. Default is the " +
+                        "empty list, meaning groupBy was not used when creating the profile.",
+                "config_overrides - Optional - Map (in curly braces) of name:value pairs, each overriding the global config parameter " +
+                        "of the same name. Default is the empty Map, meaning no overrides."
         },
-        returns="The profile measurements."
+        returns="The selected profile measurements."
 )
 public class GetProfile implements StellarFunction {
@@ -116,6 +131,16 @@ public class GetProfile implements StellarFunction {
   public static final String PROFILER_SALT_DIVISOR = "profiler.client.salt.divisor";
   /**
+   * The default Profile HBase table name should none be defined in the global properties.
+   */
+  public static final String PROFILER_HBASE_TABLE_DEFAULT = "profiler";
+
+  /**
+   * The default Profile column family name should none be defined in the global properties.
+   */
+  public static final String PROFILER_COLUMN_FAMILY_DEFAULT = "P";
+
+  /**
    * The default Profile period duration should none be defined in the global properties.
    */
   public static final String PROFILER_PERIOD_DEFAULT = "15";
@@ -130,30 +155,24 @@ public class GetProfile implements StellarFunction {
    */
   public static final String PROFILER_SALT_DIVISOR_DEFAULT = "1000";
-  private static final Logger LOG = LoggerFactory.getLogger(GetProfile.class);
-
   /**
-   * A client that can retrieve profile values.
+   * Cached client that can retrieve profile values.
    */
   private ProfilerClient client;
   /**
-   * Initialization.
+   * Cached value of config map actually used to construct the previously cached client.
    */
-  @Override
-  public void initialize(Context context) {
+  private Map cachedConfigMap = new HashMap(6);
-    // ensure the required capabilities are defined
-    Context.Capabilities[] required = { GLOBAL_CONFIG };
-    validateCapabilities(context, required);
-    @SuppressWarnings("unchecked")
-    Map global = (Map) context.getCapability(GLOBAL_CONFIG).get();
+  private static final Logger LOG = LoggerFactory.getLogger(GetProfile.class);
-    // create the profiler client
-    RowKeyBuilder rowKeyBuilder = getRowKeyBuilder(global);
-    ColumnBuilder columnBuilder = getColumnBuilder(global);
-    HTableInterface table = getTable(global);
-    client = new HBaseProfilerClient(table, rowKeyBuilder, columnBuilder);
+  /**
+   * Initialization. No longer need to do anything in initialization,
+   * as all setup is done lazily and cached.
+   */
+  @Override
+  public void initialize(Context context) {
   }
   /**
@@ -161,7 +180,7 @@ public class GetProfile implements StellarFunction {
    */
   @Override
   public boolean isInitialized() {
-    return client != null;
+    return true;
   }
   /**
@@ -177,12 +196,118 @@ public class GetProfile implements StellarFunction {
     long durationAgo = getArg(2, Long.class, args);
     String unitsName = getArg(3, String.class, args);
     TimeUnit units = TimeUnit.valueOf(unitsName);
-    List groups = getGroupsArg(4, args);
+    //Optional arguments
+    @SuppressWarnings("unchecked")
+    List groups = null;
+    Map configOverridesMap = null;
+    if (args.size() < 5) {
+      // no optional args, so default 'groups' and configOverridesMap remains null.
+      groups = new ArrayList<>(0);
+    }
+    else if (args.get(4) instanceof List) {
+      // correct extensible usage
+      groups = getArg(4, List.class, args);
+      if (args.size() >= 6) {
+        configOverridesMap = getArg(5, Map.class, args);
+        if (configOverridesMap.isEmpty()) configOverridesMap = null;
+      }
+    }
+    else {
+      // Deprecated "varargs" style usage for groups_list
+      // configOverridesMap cannot be specified so it remains null.
+      groups = getGroupsArg(4, args);
+    }
+
+    Map effectiveConfig = getEffectiveConfig(context, configOverridesMap);
+
+    //lazily create new profiler client if needed
+    if (client == null || !cachedConfigMap.equals(effectiveConfig)) {
+      RowKeyBuilder rowKeyBuilder = getRowKeyBuilder(effectiveConfig);
+      ColumnBuilder columnBuilder = getColumnBuilder(effectiveConfig);
+      HTableInterface table = getTable(effectiveConfig);
+      client = new HBaseProfilerClient(table, rowKeyBuilder, columnBuilder);
+      cachedConfigMap = effectiveConfig;
+    }
     return client.fetch(Object.class, profile, entity, groups, durationAgo, units);
   }
   /**
+   * Merge the configuration parameter override Map into the config from global context,
+   * and return the result. This has to be done on each call, because either may have changed.
+   *
+   * Only the six recognized profiler client config parameters may be set,
+   * all other key-value pairs in either Map will be ignored.
+   *
+   * Type violations cause a Stellar ParseException.
+   *
+   * @param context - from which we get the global config Map.
+   * @param configOverridesMap - Map of overrides as described above.
+   * @return effective config Map with overrides applied.
+   * @throws ParseException - if any override values are of wrong type.
+   */
+  private Map getEffectiveConfig(
+              Context context
+              , Map configOverridesMap
+  ) throws ParseException {
+
+    final String[] KEYLIST = {
+            PROFILER_HBASE_TABLE, PROFILER_COLUMN_FAMILY,
+            PROFILER_HBASE_TABLE_PROVIDER, PROFILER_PERIOD,
+            PROFILER_PERIOD_UNITS, PROFILER_SALT_DIVISOR};
+
+    // ensure the required capabilities are defined
+    final Context.Capabilities[] required = { GLOBAL_CONFIG };
+    validateCapabilities(context, required);
+    @SuppressWarnings("unchecked")
+    Map global = (Map) context.getCapability(GLOBAL_CONFIG).get();
+
+    Map result = new HashMap(6);
+    Object v;
+
+    // extract the relevant parameters from global
+    for (String k : KEYLIST) {
+      v = global.get(k);
+      if (v != null) result.put(k, v);
+    }
+    if (configOverridesMap == null) return result;
+
+    // extract override values, typechecking as we go
+    try {
+      for (Object key : configOverridesMap.keySet()) {
+        if (!(key instanceof String)) {
+          // Probably unintended user error, so throw an exception rather than ignore
+          throw new ParseException("Non-string key in config_overrides map is not allowed: " + key.toString());
+        }
+        switch ((String) key) {
+          case PROFILER_HBASE_TABLE:
+          case PROFILER_COLUMN_FAMILY:
+          case PROFILER_HBASE_TABLE_PROVIDER:
+          case PROFILER_PERIOD_UNITS:
+            v = configOverridesMap.get(key);
+            v = ConversionUtils.convert(v, String.class);
+            result.put((String) key, v);
+            break;
+          case PROFILER_PERIOD:
+          case PROFILER_SALT_DIVISOR:
+            // be tolerant if the user put a number instead of a string
+            // regardless, validate that it is an integer value
+            v = configOverridesMap.get(key);
+            long vlong = ConversionUtils.convert(v, Long.class);
+            result.put((String) key, String.valueOf(vlong));
+            break;
+          default:
+            LOG.warn("Ignoring unallowed key {} in config_overrides map.", key);
+            break;
+        }
+      }
+    } catch (ClassCastException | NumberFormatException cce) {
+      throw new ParseException("Type violation in config_overrides map values: ", cce);
+    }
+    return result;
+  }
+
+  /**
    * Get the groups defined by the user.
    *
    * The user can specify 0 or more groups. All arguments from the specified position
@@ -244,16 +369,10 @@ public class GetProfile implements StellarFunction {
    * @param global The global configuration.
    */
   private ColumnBuilder getColumnBuilder(Map global) {
-    // the builder is not currently configurable - but should be made so
     ColumnBuilder columnBuilder;
-    if(global.containsKey(PROFILER_COLUMN_FAMILY)) {
-      String columnFamily = (String) global.get(PROFILER_COLUMN_FAMILY);
-      columnBuilder = new ValueOnlyColumnBuilder(columnFamily);
-
-    } else {
-      columnBuilder = new ValueOnlyColumnBuilder();
-    }
+    String columnFamily = (String) global.getOrDefault(PROFILER_COLUMN_FAMILY, PROFILER_COLUMN_FAMILY_DEFAULT);
+    columnBuilder = new ValueOnlyColumnBuilder(columnFamily);
     return columnBuilder;
   }
@@ -289,7 +408,7 @@ public class GetProfile implements StellarFunction {
    */
   private HTableInterface getTable(Map global) {
-    String tableName = (String) global.getOrDefault(PROFILER_HBASE_TABLE, "profiler");
+    String tableName = (String) global.getOrDefault(PROFILER_HBASE_TABLE, PROFILER_HBASE_TABLE_DEFAULT);
     TableProvider provider = getTableProvider(global);
     try {

http://git-wip-us.apache.org/repos/asf/incubator-metron/blob/9ec2cdcd/metron-analytics/metron-profiler-client/src/test/java/org/apache/metron/profiler/client/GetProfileTest.java
----------------------------------------------------------------------
diff --git a/metron-analytics/metron-profiler-client/src/test/java/org/apache/metron/profiler/client/GetProfileTest.java b/metron-analytics/metron-profiler-client/src/test/java/org/apache/metron/profiler/client/GetProfileTest.java
index 4bb3420..960795b 100644
--- a/metron-analytics/metron-profiler-client/src/test/java/org/apache/metron/profiler/client/GetProfileTest.java
+++ b/metron-analytics/metron-profiler-client/src/test/java/org/apache/metron/profiler/client/GetProfileTest.java
@@ -69,6 +69,10 @@ public class GetProfileTest {
   private StellarExecutor executor;
   private Map state;
   private ProfileWriter profileWriter;
+  // different values of period and salt divisor, used to test config_overrides feature
+  private static final long periodDuration2 = 1;
+  private static final TimeUnit periodUnits2 = TimeUnit.HOURS;
+  private static final int saltDivisor2 = 2050;
   /**
    * A TableProvider that allows us to mock HBase.
@@ -87,6 +91,17 @@ public class GetProfileTest {
     return executor.execute(expression, state, clazz);
   }
+  /**
+   * This method sets up the configuration context for both writing profile data
+   * (using profileWriter to mock the complex process of what the Profiler topology
+   * actually does), and then reading that profile data (thereby testing the PROFILE_GET
+   * Stellar client implemented in GetProfile).
+   *
+   * It runs at @Before time, and sets testclass global variables used by the writers and readers.
+   * The various writers and readers are in each test case, not here.
+   *
+   * @return void
+   */
   @Before
   public void setup() {
     state = new HashMap<>();
@@ -117,6 +132,51 @@ public class GetProfileTest {
   }
   /**
+   * This method is similar to setup(), in that it sets up profiler configuration context,
+   * but only for the client. Additionally, it uses periodDuration2, periodUnits2
+   * and saltDivisor2, instead of periodDuration, periodUnits and saltDivisor respectively.
+   *
+   * This is used in the unit tests that test the config_overrides feature of PROFILE_GET.
+   * In these tests, the context from @Before setup() is used to write the data, then the global
+   * context is changed to context2 (from this method). Each test validates that a default read
+   * using global context2 then gets no valid results (as expected), and that a read using
+   * original context values in the PROFILE_GET config_overrides argument gets all expected results.
+   *
+   * @return context2 - The profiler client configuration context created by this method.
+   *    The context2 values are also set in the configuration of the StellarExecutor
+   *    stored in the global variable 'executor'. However, there is no API for querying the
+   *    context values from a StellarExecutor, so we output the context2 Context object itself,
+   *    for validation purposes (so that its values can be validated as being significantly
+   *    different from the setup() settings).
+   */
+  private Context setup2() {
+    state = new HashMap<>();
+
+    // global properties
+    Map global = new HashMap() {{
+      put(PROFILER_HBASE_TABLE, tableName);
+      put(PROFILER_COLUMN_FAMILY, columnFamily);
+      put(PROFILER_HBASE_TABLE_PROVIDER, MockTableProvider.class.getName());
+      put(PROFILER_PERIOD, Long.toString(periodDuration2));
+      put(PROFILER_PERIOD_UNITS, periodUnits2.toString());
+      put(PROFILER_SALT_DIVISOR, Integer.toString(saltDivisor2));
+    }};
+
+    // create the modified context
+    Context context2 = new Context.Builder()
+            .with(Context.Capabilities.GLOBAL_CONFIG, () -> global)
+            .build();
+
+    // create the stellar execution environment
+    executor = new DefaultStellarExecutor(
+            new SimpleFunctionResolver()
+                    .withClass(GetProfile.class),
+            context2);
+
+    return context2; //because there is no executor.getContext() method
+  }
+
+  /**
    * Values should be retrievable that have NOT been stored within a group.
    */
   @Test
@@ -168,12 +228,19 @@ public class GetProfileTest {
     state.put("groups", group);
     // execute - read the profile values
-    String expr = "PROFILE_GET('profile1', 'entity1', 4, 'HOURS', 'weekends')";
+    String expr = "PROFILE_GET('profile1', 'entity1', 4, 'HOURS', ['weekends'])";
     @SuppressWarnings("unchecked")
     List result = run(expr, List.class);
     // validate - expect to read all values from the past 4 hours
     Assert.assertEquals(count, result.size());
+
+    // test the deprecated but allowed "varargs" form of groups specification
+    expr = "PROFILE_GET('profile1', 'entity1', 4, 'HOURS', 'weekends')";
+    result = run(expr, List.class);
+
+    // validate - expect to read all values from the past 4 hours
+    Assert.assertEquals(count, result.size());
   }
   /**
@@ -199,12 +266,19 @@ public class GetProfileTest {
     state.put("groups", group);
     // execute - read the profile values
-    String expr = "PROFILE_GET('profile1', 'entity1', 4, 'HOURS', 'weekdays', 'tuesday')";
+    String expr = "PROFILE_GET('profile1', 'entity1', 4, 'HOURS', ['weekdays', 'tuesday'])";
     @SuppressWarnings("unchecked")
     List result = run(expr, List.class);
     // validate - expect to read all values from the past 4 hours
     Assert.assertEquals(count, result.size());
+
+    // test the deprecated but allowed "varargs" form of groups specification
+    expr = "PROFILE_GET('profile1', 'entity1', 4, 'HOURS', 'weekdays', 'tuesday')";
+    result = run(expr, List.class);
+
+    // validate - expect to read all values from the past 4 hours
+    Assert.assertEquals(count, result.size());
   }
   /**
@@ -254,4 +328,108 @@ public class GetProfileTest {
     // validate - there should be no values from only 4 seconds ago
     Assert.assertEquals(0, result.size());
   }
+
+  /**
+   * Values should be retrievable that were written with configuration different than current global config.
+   */
+  @Test
+  public void testWithConfigOverride() {
+    final int periodsPerHour = 4;
+    final int expectedValue = 2302;
+    final int hours = 2;
+    final long startTime = System.currentTimeMillis() - TimeUnit.HOURS.toMillis(hours);
+    final List group = Collections.emptyList();
+
+    // setup - write some measurements to be read later
+    final int count = hours * periodsPerHour;
+    ProfileMeasurement m = new ProfileMeasurement()
+            .withProfileName("profile1")
+            .withEntity("entity1")
+            .withPeriod(startTime, periodDuration, periodUnits);
+    profileWriter.write(m, count, group, val -> expectedValue);
+
+    // now change the executor configuration
+    Context context2 = setup2();
+    // validate it is changed in significant way
+    @SuppressWarnings("unchecked")
+    Map global = (Map) context2.getCapability(Context.Capabilities.GLOBAL_CONFIG).get();
+    Assert.assertEquals(global.get(PROFILER_PERIOD), Long.toString(periodDuration2));
+    Assert.assertNotEquals(periodDuration, periodDuration2);
+
+    // execute - read the profile values - with (wrong) default global config values.
+    // No error message at this time, but returns empty results list, because
+    // row keys are not correctly calculated.
+    String expr = "PROFILE_GET('profile1', 'entity1', 4, 'HOURS')";
+    @SuppressWarnings("unchecked")
+    List result = run(expr, List.class);
+
+    // validate - expect to fail to read any values
+    Assert.assertEquals(0, result.size());
+
+    // execute - read the profile values - with config_override.
+    // first two override values are strings, third is deliberately a number.
+    expr = "PROFILE_GET('profile1', 'entity1', 4, 'HOURS', [], {" +
+            "'profiler.client.period.duration' : '" + periodDuration + "', " +
+            "'profiler.client.period.duration.units' : '" + periodUnits.toString() + "', " +
+            "'profiler.client.salt.divisor' : " + saltDivisor + " })";
+    result = run(expr, List.class);
+
+    // validate - expect to read all values from the past 4 hours
+    Assert.assertEquals(count, result.size());
+  }
+
+  /**
+   * Values should be retrievable that have been stored within a 'group', with
+   * configuration different than current global config.
+   * This time put the config_override case before the non-override case.
+   */
+  @Test
+  public void testWithConfigAndOneGroup() {
+    final int periodsPerHour = 4;
+    final int expectedValue = 2302;
+    final int hours = 2;
+    final long startTime = System.currentTimeMillis() - TimeUnit.HOURS.toMillis(hours);
+    final List group = Arrays.asList("weekends");
+
+    // setup - write some measurements to be read later
+    final int count = hours * periodsPerHour;
+    ProfileMeasurement m = new ProfileMeasurement()
+            .withProfileName("profile1")
+            .withEntity("entity1")
+            .withPeriod(startTime, periodDuration, periodUnits);
+    profileWriter.write(m, count, group, val -> expectedValue);
+
+    // create a variable that contains the groups to use
+    state.put("groups", group);
+
+    // now change the executor configuration
+    Context context2 = setup2();
+    // validate it is changed in significant way
+    @SuppressWarnings("unchecked")
+    Map global = (Map) context2.getCapability(Context.Capabilities.GLOBAL_CONFIG).get();
+    Assert.assertEquals(global.get(PROFILER_PERIOD), Long.toString(periodDuration2));
+    Assert.assertNotEquals(periodDuration, periodDuration2);
+
+    // execute - read the profile values - with config_override.
+    // first two override values are strings, third is deliberately a number.
+ String expr = "PROFILE_GET('profile1', 'entity1', 4, 'HOURS', ['weekends'], {"
+ + "'profiler.client.period.duration' : '" + periodDuration + "', "
+ + "'profiler.client.period.duration.units' : '" + periodUnits.toString() + "', "
+ + "'profiler.client.salt.divisor' : " + saltDivisor + " })";
+ @SuppressWarnings("unchecked")
+ List result = run(expr, List.class);
+
+ // validate - expect to read all values from the past 4 hours
+ Assert.assertEquals(count, result.size());
+
+ // execute - read the profile values - with (wrong) default global config values.
+ // No error message at this time, but returns empty results list, because
+ // row keys are not correctly calculated.
+ expr = "PROFILE_GET('profile1', 'entity1', 4, 'HOURS', ['weekends'])";
+ result = run(expr, List.class);
+
+ // validate - expect to fail to read any values
+ Assert.assertEquals(0, result.size());
+ }
+
+ }

http://git-wip-us.apache.org/repos/asf/incubator-metron/blob/9ec2cdcd/metron-analytics/metron-profiler/README.md
----------------------------------------------------------------------
diff --git a/metron-analytics/metron-profiler/README.md b/metron-analytics/metron-profiler/README.md
index 2d347d2..04e1c0d 100644
--- a/metron-analytics/metron-profiler/README.md
+++ b/metron-analytics/metron-profiler/README.md
@@ -24,6 +24,12 @@ This section will describe the steps required to get your first profile running.
 hbase(main):001:0> create 'profiler', 'P'
 ```
+1. Edit the configuration file located at `$METRON_HOME/config/profiler.properties`. Change the kafka.zk and kafka.broker values from "node1" to the appropriate host name. Keep the same port numbers:
+ ```
+ kafka.zk=node1:2181
+ kafka.broker=node1:6667
+ ```
+
 1. Define the profile in a file located at `$METRON_HOME/config/zookeeper/profiler.json`. The following example JSON will create a profile that simply counts the number of messages per `ip_src_addr`, during each sampling interval.
 ```
 {
@@ -31,7 +37,7 @@ This section will describe the steps required to get your first profile running.
 {
 "profile": "test",
 "foreach": "ip_src_addr",
- "init": { "count": 0 },
+ "init": { "count": "0" },
 "update": { "count": "count + 1" },
 "result": "count"
 }
@@ -39,7 +45,7 @@ This section will describe the steps required to get your first profile running.
 }
 ```
-1. Upload the profile definition to Zookeeper.
+1. Upload the profile definition to Zookeeper. (As always, change "node1" to the actual hostname.)
 ```
 $ cd $METRON_HOME
 $ bin/zk_load_configs.sh -m PUSH -i config/zookeeper/ -z node1:2181
@@ -58,7 +64,9 @@ This section will describe the steps required to get your first profile running.
 hbase(main):001:0> count 'profiler'
 ```
-1. Use the Profiler Client to read the profile data. The below example `PROFILE_GET` command will read data written by the sample profile given above, if 10.0.0.1 is one of the input values for `ip_src_addr`. More information on using the client can be found [here](../metron-profiler-client).
+1. Use the Profiler Client to read the profile data. The below example `PROFILE_GET` command will read data written by the sample profile given above, if 10.0.0.1 is one of the input values for `ip_src_addr`.
+More information on configuring and using the client can be found [here](../metron-profiler-client).
+It is assumed that the PROFILE_GET client is correctly configured before using it.
 ```
 $ bin/stellar -z node1:2181
@@ -68,7 +76,9 @@ This section will describe the steps required to get your first profile running.
 ## Creating Profiles
-The Profiler configuration requires a JSON-formatted set of elements, many of which can contain Stellar code. The configuration contains the following elements. For the impatient, skip ahead to the [Examples](#examples).
+The Profiler specification requires a JSON-formatted set of elements, many of which can contain Stellar code. The specification contains the following elements.
(For the impatient, skip ahead to the [Examples](#examples).)
+The specification for the Profiler topology is stored in Zookeeper at `/metron/topology/profiler`. These properties also exist in the local filesystem at `$METRON_HOME/config/zookeeper/profiler.json`.
+The values can be changed on disk and then uploaded to Zookeeper using `$METRON_HOME/bin/zk_load_configs.sh`.
 | Name | | Description
 |--- |--- |---
@@ -117,7 +127,7 @@ The 'groupBy' expressions can refer to any field within a `org.apache.metron.pro
 *Optional*
-One or more expressions executed at the start of a window period. A map is expected where the key is the variable name and the value is a Stellar expression. The map can contain 0 or more variables/expressions. At the start of each window period the expression is executed once and stored in a variable with the given name.
+One or more expressions executed at the start of a window period. A map is expected where the key is the variable name and the value is a Stellar expression. The map can contain zero or more variable:expression pairs. At the start of each window period, each expression is executed once and stored in the given variable. Note that constant init values such as "0" must be in quotes regardless of their type, as the init value must be a string to be executed by Stellar.
 ```
 "init": {
@@ -143,7 +153,7 @@ One or more expressions executed when a message is applied to the profile. A ma
 *Required*
-A Stellar expression that is executed when the window period expires. The expression is expected to summarize the messages that were applied to the profile over the window period. The expression must result in a numeric value such as a Double, Long, Float, Short, or Integer.
+A Stellar expression that is executed when the window period expires. The expression is expected to summarize the messages that were applied to the profile over the window period, using the state accumulated by the updates.
The result will typically be a single numeric value, but it may be any serializable object, as shown in Example 4 below.
 ### `expires`
@@ -153,7 +163,8 @@ A numeric value that defines how many days the profile data is retained. After
 ## Configuring the Profiler
-The Profiler runs as an independent Storm topology. The configuration for the Profiler topology is stored in Zookeeper at `/metron/topology/profiler`. These properties also exist in the the default installation of Metron at `$METRON_HOME/config/zookeeper/profiler.json`. The values can be changed on disk and then uploaded to Zookeeper using `$METRON_HOME/bin/zk_load_configs.sh`.
+The Profiler runs as an independent Storm topology. The configuration for the Profiler topology is stored in local filesystem at `$METRON_HOME/config/profiler.properties`.
+The values can be changed on disk and then the Profiler topology must be restarted.
 | Setting | Description |
 |--- |--- |
@@ -300,7 +311,7 @@ This creates a profile...
 It is important to note that the Profiler can persist any serializable Object, not just numeric values. An alternative to the previous example could take advantage of this.
-Instead of storing the mean of the length, the profile could store a more generic summary of the length. This summary can then be used at a later time to calculate the mean, min, max, percentiles, or any other sensible metric. This provides a much greater degree of flexibility.
+Instead of storing the mean of the lengths, the profile could store a statistical summarization of the lengths. This summary can then be used at a later time to calculate the mean, min, max, percentiles, or any other sensible metric. This provides a much greater degree of flexibility.
 ```
 {
@@ -316,7 +327,8 @@ Instead of storing the mean of the length, the profile could store a more generi
 }
 ```
-The following Stellar REPL session shows how you might use this summary to calculate different metrics with the same underlying profile data.
+The following Stellar REPL session shows how you might use this summary to calculate different metrics with the same underlying profile data.
+It is assumed that the PROFILE_GET client is configured as described [here](../metron-profiler-client).
 Retrieve the last 30 minutes of profile measurements for a specific host.
 ```

http://git-wip-us.apache.org/repos/asf/incubator-metron/blob/9ec2cdcd/metron-platform/metron-common/README.md
----------------------------------------------------------------------
diff --git a/metron-platform/metron-common/README.md b/metron-platform/metron-common/README.md
index 019c538..cbf0180 100644
--- a/metron-platform/metron-common/README.md
+++ b/metron-platform/metron-common/README.md
@@ -413,8 +413,9 @@ MAP_GET`
 * entity - The name of the entity.
 * durationAgo - How long ago should values be retrieved from?
 * units - The units of 'durationAgo'.
- * groups - Optional - The groups used to sort the profile.
- * Returns: The profile measurements.
+ * groups_list - Optional, must correspond to the 'groupBy' list used in profile creation - List (in square brackets) of groupBy values used to filter the profile. Default is the empty list, meaning groupBy was not used when creating the profile.
+ * config_overrides - Optional - Map (in curly braces) of name:value pairs, each overriding the global config parameter of the same name. Default is the empty Map, meaning no overrides.
+ * Returns: The selected profile measurements.
 ### `PROTOCOL_TO_NAME`
 * Description: Converts the IANA protocol number to the protocol name