metron-commits mailing list archives

From ceste...@apache.org
Subject incubator-metron git commit: METRON-532 Define Profile Period When Calling PROFILE_GET closes apache/incubator-metron#414
Date Mon, 16 Jan 2017 14:18:42 GMT
Repository: incubator-metron
Updated Branches:
  refs/heads/master 56ff50c3d -> 9ec2cdcdc


METRON-532 Define Profile Period When Calling PROFILE_GET closes apache/incubator-metron#414


Project: http://git-wip-us.apache.org/repos/asf/incubator-metron/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-metron/commit/9ec2cdcd
Tree: http://git-wip-us.apache.org/repos/asf/incubator-metron/tree/9ec2cdcd
Diff: http://git-wip-us.apache.org/repos/asf/incubator-metron/diff/9ec2cdcd

Branch: refs/heads/master
Commit: 9ec2cdcdc4675737fec15f9e958b46f17268a227
Parents: 56ff50c
Author: mattf-horton <mfoley@hortonworks.com>
Authored: Mon Jan 16 09:18:38 2017 -0500
Committer: cstella <cestella@gmail.com>
Committed: Mon Jan 16 09:18:38 2017 -0500

----------------------------------------------------------------------
 .../metron-profiler-client/README.md            |  89 +++++++--
 .../profiler/client/stellar/GetProfile.java     | 179 +++++++++++++++---
 .../metron/profiler/client/GetProfileTest.java  | 182 ++++++++++++++++++-
 metron-analytics/metron-profiler/README.md      |  30 ++-
 metron-platform/metron-common/README.md         |   5 +-
 5 files changed, 427 insertions(+), 58 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-metron/blob/9ec2cdcd/metron-analytics/metron-profiler-client/README.md
----------------------------------------------------------------------
diff --git a/metron-analytics/metron-profiler-client/README.md b/metron-analytics/metron-profiler-client/README.md
index 8a55739..105fce9 100644
--- a/metron-analytics/metron-profiler-client/README.md
+++ b/metron-analytics/metron-profiler-client/README.md
@@ -22,30 +22,51 @@ These examples assume a profile has been defined called 'snort-alerts' that trac
 }
 ```
 
-During model scoring the entity being scored, in this case a particular IP address, will be known.  The following examples highlight how this profile data might be retrieved.
+During model scoring the entity being scored, in this case a particular IP address, will be known.  The following examples shows how this profile data might be retrieved.
 
-Retrieve all values of 'snort-alerts' from '10.0.0.1' over the past 4 hours.
+The Stellar client consists of the `PROFILE_GET` command, which takes the following arguments:
 ```
-PROFILE_GET('snort-alerts', '10.0.0.1', 4, 'HOURS')
+REQUIRED:
+    profile - The name of the profile
+    entity - The name of the entity
+    durationAgo - How long ago should values be retrieved from?
+    units - The units of 'durationAgo'
+OPTIONAL:
+	groups_list - Optional, must correspond to the 'groupBy' list used in profile creation - List (in square brackets) of
+            groupBy values used to filter the profile. Default is the empty list, meaning groupBy was not used when
+            creating the profile.
+    config_overrides - Optional - Map (in curly braces) of name:value pairs, each overriding the global config parameter
+            of the same name. Default is the empty Map, meaning no overrides.
 ```
+There is an older calling format where `groups_list` is specified as a sequence of group names, "varargs" style, instead of a List object.  This format is still supported for backward compatibility, but it is deprecated, and it is disallowed if the optional `config_overrides` argument is used.
 
-Retrieve all values of 'snort-alerts' from '10.0.0.1' over the past 2 days.
+### Groups_list argument
+The `groups_list` argument in the client must exactly correspond to the [`groupBy`](../metron-profiler#groupby) configuration in the profile definition.  If `groupBy` was not used in the profile, `groups_list` must be empty in the client.  If `groupBy` was used in the profile, then the client `groups_list` is <b>not</b> optional; it must be the same length as the `groupBy` list, and specify exactly one selected group value for each `groupBy` criterion, in the same order.  For example:
 ```
-PROFILE_GET('snort-alerts', '10.0.0.1', 2, 'DAYS')
+If in Profile, the groupBy criteria are:  [ “DAY_OF_WEEK()”, “URL_TO_PORT()” ]
+Then in PROFILE_GET, an allowed groups value would be:  [ “3”, “8080” ]
+which will select only records from Tuesdays with port number 8080.
 ```
 
-If the profile had been defined to group the data by weekday versus weekend, then the following example would apply.
+### Configuration and the config_overrides argument
 
-Retrieve all values of 'snort-alerts' from '10.0.0.1' that occurred on 'weekdays' over the past month.
-```
-PROFILE_GET('snort-alerts', '10.0.0.1', 1, 'MONTHS', 'weekdays')
-```
+By default, the Profiler creates profiles with a period duration of 15 minutes. This means that data is accumulated, summarized and flushed every 15 minutes. 
+The Client API must also have knowledge of this duration to correctly retrieve the profile data. If the Client is expecting 15 minute periods, it will not be 
+able to read data generated by a Profiler that was configured for 1 hour periods, and will return zero results.  
 
-### Configuration
+Similarly, all six Client configuration parameters listed in the table below must match the Profiler configuration parameter settings from the time the profile 
+was created. The period duration and other configuration parameters from the Profiler topology are stored in local filesystem at `$METRON_HOME/config/profiler.properties`. 
+The Stellar Client API can be configured correspondingly by setting the following properties in Metron's global configuration, on local filesystem at
+`$METRON_HOME/config/zookeeper/global.json`, then uploaded to Zookeeper (at `/metron/topology/global`) by using `zk_load_configs.sh`: 
 
-By default, the Profiler creates Profiles with a period duration of 15 minutes. This means that data is accumulated, summarized and flushed every 15 minutes. The Client API must also have knowledge of this duration to correctly retrieve the profile data. If the client API is expecting 15 minute periods, it will not be able to read data generated by a Profiler that has been configured with a 1 hour period.
+    ```
+    $ cd $METRON_HOME
+    $ bin/zk_load_configs.sh -m PUSH -i config/zookeeper/ -z node1:2181
+    ```
 
-The period duration can be configured in the Profiler by altering the Profiler topology's static properties file (`$METRON/config/profiler.properties`). The Stellar Client API can be configured by setting the following properties in Metron's global configuration.
+Any of these six Client configuration parameters may be overridden at run time using the `config_overrides` Map argument in PROFILE_GET. The primary use case is 
+when historical profiles have been created with a different Profiler configuration than is currently configured, and the analyst needing to access them does not 
+want to change the global Client configuration so as not to disrupt the work of other analysts working with current profiles.
 
 | Key                                   | Description                                                  | Required | Default  |
 | ------------------------------------- | -------- | -------- | -------- |
@@ -56,6 +77,40 @@ The period duration can be configured in the Profiler by altering the Profiler t
 | profiler.client.salt.divisor          | The salt divisor used to store profile data. | Optional | 1000     |
 | hbase.provider.impl                   | The name of the HBaseTableProvider implementation class. | Optional |          |
 
+### Errors
+The most common result of incorrect PROFILE_GET arguments or Client configuration parameters is an empty result set, rather than an error.  The Client cannot effectively validate the arguments, because the Profiler configuration parameters may be changed and the profile itself does not store them.  The person doing the querying must carry forward the knowledge of the Profiler configuration parameters from the time of profile creation, and use corresponding PROFILE_GET arguments and Client configuration parameters when querying the data.
+
+### Examples
+Retrieve all values of 'snort-alerts' from '10.0.0.1' over the past 4 hours.
+```
+PROFILE_GET('snort-alerts', '10.0.0.1', 4, 'HOURS')
+```
+
+Retrieve all values of 'snort-alerts' from '10.0.0.1' over the past 2 days.
+```
+PROFILE_GET('snort-alerts', '10.0.0.1', 2, 'DAYS')
+```
+
+If the profile had been defined to group the data by weekday versus weekend, then the following example would apply:
+
+Retrieve all values of 'snort-alerts' from '10.0.0.1' that occurred on 'weekdays' over the past month.
+```
+PROFILE_GET('snort-alerts', '10.0.0.1', 1, 'MONTHS', ['weekdays'] )
+```
+
+The client may need to use a configuration different from the current Client configuration settings.  For example, perhaps you are on a cluster shared with other analysts, and need to access a profile that was constructed 2 months ago using different period duration, while they are accessing more recent profiles constructed with the currently configured period duration.  For this situation, you may use the `config_overrides` argument:
+
+Retrieve all values of 'snort-alerts' from '10.0.0.1' over the past 2 days, with no `groupBy`, and overriding the usual global client configuration parameters for window duration.
+```
+PROFILE_GET('profile1', 'entity1', 2, 'DAYS', [], {'profiler.client.period.duration' : '2', 'profiler.client.period.duration.units' : 'MINUTES'})
+```
+
+Retrieve all values of 'snort-alerts' from '10.0.0.1' that occurred on 'weekdays' over the past month, overriding the usual global client configuration parameters for window duration.
+```
+PROFILE_GET('profile1', 'entity1', 1, 'MONTHS', ['weekdays'], {'profiler.client.period.duration' : '2', 'profiler.client.period.duration.units' : 'MINUTES'})
+```
+
+
 ## Getting Started
 
 These instructions step through the process of using the Stellar Client API on a live cluster.  These instructions assume that the 'Getting Started' instructions included with the [Metron Profiler](../metron-profiler) have been followed.  This will create a Profile called 'test' whose data will be retrieved with the Stellar Client API.
@@ -78,9 +133,13 @@ Arguments:
 	entity - The name of the entity.
 	durationAgo - How long ago should values be retrieved from?
 	units - The units of 'durationAgo'.
-	groups - Optional - The groups used to sort the profile.
+	groups_list - Optional, must correspond to the 'groupBy' list used in profile creation - List (in square brackets) of
+            groupBy values used to filter the profile. Default is the empty list, meaning groupBy was not used when
+            creating the profile.
+	config_overrides - Optional - Map (in curly braces) of name:value pairs, each overriding the global config parameter
+            of the same name. Default is the empty Map, meaning no overrides.
 
-Returns: The profile measurements.
+Returns: The selected profile measurements.
 
 [Stellar]>>> PROFILE_GET('test','192.168.138.158', 1, 'HOURS')
 [12078.0, 8921.0, 12131.0]

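Note on the README text above: it says a Client expecting 15 minute periods returns zero results for data written with 1 hour periods. A minimal Java sketch of why that can happen, assuming (this is not shown in the diff) that each stored row is keyed by a period index computed as the timestamp divided by the period length. The class and method names are hypothetical and are not part of Metron.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical illustration only; this is not the Metron row key builder. */
final class PeriodMismatchSketch {

  /** Assumed scheme: period index = epoch millis / period length in millis. */
  static long periodIndex(long epochMillis, long duration, TimeUnit units) {
    return epochMillis / units.toMillis(duration);
  }

  public static void main(String[] args) {
    // An arbitrary timestamp, 10 hours after the epoch.
    long ts = TimeUnit.HOURS.toMillis(10);

    // Index the writer would use with 1 hour periods.
    long written = periodIndex(ts, 1, TimeUnit.HOURS);

    // Index a reader would look for if it assumed 15 minute periods.
    long lookedUp = periodIndex(ts, 15, TimeUnit.MINUTES);

    // The two indexes differ, so a lookup keyed on the second finds nothing.
    System.out.println("written=" + written + " lookedUp=" + lookedUp);
  }
}
```

Under this assumption the same timestamp maps to different period indexes, so the reader builds keys that were never written and gets an empty list instead of an error.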
http://git-wip-us.apache.org/repos/asf/incubator-metron/blob/9ec2cdcd/metron-analytics/metron-profiler-client/src/main/java/org/apache/metron/profiler/client/stellar/GetProfile.java
----------------------------------------------------------------------
diff --git a/metron-analytics/metron-profiler-client/src/main/java/org/apache/metron/profiler/client/stellar/GetProfile.java b/metron-analytics/metron-profiler-client/src/main/java/org/apache/metron/profiler/client/stellar/GetProfile.java
index 9d4aa54..beb55e0 100644
--- a/metron-analytics/metron-profiler-client/src/main/java/org/apache/metron/profiler/client/stellar/GetProfile.java
+++ b/metron-analytics/metron-profiler-client/src/main/java/org/apache/metron/profiler/client/stellar/GetProfile.java
@@ -41,6 +41,7 @@ import org.slf4j.LoggerFactory;
 
 import java.io.IOException;
 import java.util.ArrayList;
+import java.util.HashMap;
 import java.util.List;
 import java.util.Map;
 import java.util.concurrent.TimeUnit;
@@ -65,7 +66,17 @@ import static org.apache.metron.common.dsl.Context.Capabilities.GLOBAL_CONFIG;
  *
  * Retrieve all values for 'entity1' from 'profile1' that occurred on 'weekdays' over the past month.
  *
- *   <code>PROFILE_GET('profile1', 'entity1', 1, 'MONTHS', 'weekdays')</code>
+ *   <code>PROFILE_GET('profile1', 'entity1', 1, 'MONTHS', ['weekdays'])</code>
+ *
+ * Retrieve all values for 'entity1' from 'profile1' over the past 2 days, with no 'groupBy',
+ * and overriding the usual global client configuration parameters for window duration.
+ *
+ *   <code>PROFILE_GET('profile1', 'entity1', 2, 'DAYS', [], {'profiler.client.period.duration' : '2', 'profiler.client.period.duration.units' : 'MINUTES'})</code>
+ *
+ * Retrieve all values for 'entity1' from 'profile1' that occurred on 'weekdays' over the past month,
+ * overriding the usual global client configuration parameters for window duration.
+ *
+ *   <code>PROFILE_GET('profile1', 'entity1', 1, 'MONTHS', ['weekdays'], {'profiler.client.period.duration' : '2', 'profiler.client.period.duration.units' : 'MINUTES'})</code>
  *
  */
 @Stellar(
@@ -77,9 +88,13 @@ import static org.apache.metron.common.dsl.Context.Capabilities.GLOBAL_CONFIG;
           "entity - The name of the entity.",
           "durationAgo - How long ago should values be retrieved from?",
           "units - The units of 'durationAgo'.",
-          "groups - Optional - The groups used to sort the profile."
+          "groups_list - Optional, must correspond to the 'groupBy' list used in profile creation - List (in square brackets) of "+
+                  "groupBy values used to filter the profile. Default is the " +
+                  "empty list, meaning groupBy was not used when creating the profile.",
+          "config_overrides - Optional - Map (in curly braces) of name:value pairs, each overriding the global config parameter " +
+                  "of the same name. Default is the empty Map, meaning no overrides."
         },
-        returns="The profile measurements."
+        returns="The selected profile measurements."
 )
 public class GetProfile implements StellarFunction {
 
@@ -116,6 +131,16 @@ public class GetProfile implements StellarFunction {
   public static final String PROFILER_SALT_DIVISOR = "profiler.client.salt.divisor";
 
   /**
+   * The default Profile HBase table name should none be defined in the global properties.
+   */
+  public static final String PROFILER_HBASE_TABLE_DEFAULT = "profiler";
+
+  /**
+   * The default Profile column family name should none be defined in the global properties.
+   */
+  public static final String PROFILER_COLUMN_FAMILY_DEFAULT = "P";
+
+  /**
    * The default Profile period duration should none be defined in the global properties.
    */
   public static final String PROFILER_PERIOD_DEFAULT = "15";
@@ -130,30 +155,24 @@ public class GetProfile implements StellarFunction {
    */
   public static final String PROFILER_SALT_DIVISOR_DEFAULT = "1000";
 
-  private static final Logger LOG = LoggerFactory.getLogger(GetProfile.class);
-
   /**
-   * A client that can retrieve profile values.
+   * Cached client that can retrieve profile values.
    */
   private ProfilerClient client;
 
   /**
-   * Initialization.
+   * Cached value of config map actually used to construct the previously cached client.
    */
-  @Override
-  public void initialize(Context context) {
+  private Map<String, Object> cachedConfigMap = new HashMap<String, Object>(6);
 
-    // ensure the required capabilities are defined
-    Context.Capabilities[] required = { GLOBAL_CONFIG };
-    validateCapabilities(context, required);
-    @SuppressWarnings("unchecked")
-    Map<String, Object> global = (Map<String, Object>) context.getCapability(GLOBAL_CONFIG).get();
+  private static final Logger LOG = LoggerFactory.getLogger(GetProfile.class);
 
-    // create the profiler client
-    RowKeyBuilder rowKeyBuilder = getRowKeyBuilder(global);
-    ColumnBuilder columnBuilder = getColumnBuilder(global);
-    HTableInterface table = getTable(global);
-    client = new HBaseProfilerClient(table, rowKeyBuilder, columnBuilder);
+  /**
+   * Initialization.  No longer need to do anything in initialization,
+   * as all setup is done lazily and cached.
+   */
+  @Override
+  public void initialize(Context context) {
   }
 
   /**
@@ -161,7 +180,7 @@ public class GetProfile implements StellarFunction {
    */
   @Override
   public boolean isInitialized() {
-    return client != null;
+    return true;
   }
 
   /**
@@ -177,12 +196,118 @@ public class GetProfile implements StellarFunction {
     long durationAgo = getArg(2, Long.class, args);
     String unitsName = getArg(3, String.class, args);
     TimeUnit units = TimeUnit.valueOf(unitsName);
-    List<Object> groups = getGroupsArg(4, args);
+    //Optional arguments
+    @SuppressWarnings("unchecked")
+    List<Object> groups = null;
+    Map configOverridesMap = null;
+    if (args.size() < 5) {
+      // no optional args, so default 'groups' and configOverridesMap remains null.
+      groups = new ArrayList<>(0);
+    }
+    else if (args.get(4) instanceof List) {
+      // correct extensible usage
+      groups = getArg(4, List.class, args);
+      if (args.size() >= 6) {
+        configOverridesMap = getArg(5, Map.class, args);
+        if (configOverridesMap.isEmpty()) configOverridesMap = null;
+      }
+    }
+    else {
+      // Deprecated "varargs" style usage for groups_list
+      // configOverridesMap cannot be specified so it remains null.
+      groups = getGroupsArg(4, args);
+    }
+
+    Map<String, Object> effectiveConfig = getEffectiveConfig(context, configOverridesMap);
+
+    //lazily create new profiler client if needed
+    if (client == null || !cachedConfigMap.equals(effectiveConfig)) {
+      RowKeyBuilder rowKeyBuilder = getRowKeyBuilder(effectiveConfig);
+      ColumnBuilder columnBuilder = getColumnBuilder(effectiveConfig);
+      HTableInterface table = getTable(effectiveConfig);
+      client = new HBaseProfilerClient(table, rowKeyBuilder, columnBuilder);
+      cachedConfigMap = effectiveConfig;
+    }
 
     return client.fetch(Object.class, profile, entity, groups, durationAgo, units);
   }
 
   /**
+   * Merge the configuration parameter override Map into the config from global context,
+   * and return the result.  This has to be done on each call, because either may have changed.
+   *
+   * Only the six recognized profiler client config parameters may be set,
+   * all other key-value pairs in either Map will be ignored.
+   *
+   * Type violations cause a Stellar ParseException.
+   *
+   * @param context - from which we get the global config Map.
+   * @param configOverridesMap - Map of overrides as described above.
+   * @return effective config Map with overrides applied.
+   * @throws ParseException - if any override values are of wrong type.
+   */
+  private Map<String, Object> getEffectiveConfig(
+              Context context
+              , Map configOverridesMap
+  ) throws ParseException {
+
+    final String[] KEYLIST = {
+            PROFILER_HBASE_TABLE, PROFILER_COLUMN_FAMILY,
+            PROFILER_HBASE_TABLE_PROVIDER, PROFILER_PERIOD,
+            PROFILER_PERIOD_UNITS, PROFILER_SALT_DIVISOR};
+
+    // ensure the required capabilities are defined
+    final Context.Capabilities[] required = { GLOBAL_CONFIG };
+    validateCapabilities(context, required);
+    @SuppressWarnings("unchecked")
+    Map<String, Object> global = (Map<String, Object>) context.getCapability(GLOBAL_CONFIG).get();
+
+    Map<String, Object> result = new HashMap<String, Object>(6);
+    Object v;
+
+    // extract the relevant parameters from global
+    for (String k : KEYLIST) {
+      v = global.get(k);
+      if (v != null) result.put(k, v);
+    }
+    if (configOverridesMap == null) return result;
+
+    // extract override values, typechecking as we go
+    try {
+      for (Object key : configOverridesMap.keySet()) {
+        if (!(key instanceof String)) {
+          // Probably unintended user error, so throw an exception rather than ignore
+          throw new ParseException("Non-string key in config_overrides map is not allowed: " + key.toString());
+        }
+        switch ((String) key) {
+          case PROFILER_HBASE_TABLE:
+          case PROFILER_COLUMN_FAMILY:
+          case PROFILER_HBASE_TABLE_PROVIDER:
+          case PROFILER_PERIOD_UNITS:
+            v = configOverridesMap.get(key);
+            v = ConversionUtils.convert(v, String.class);
+            result.put((String) key, v);
+            break;
+          case PROFILER_PERIOD:
+          case PROFILER_SALT_DIVISOR:
+            // be tolerant if the user put a number instead of a string
+            // regardless, validate that it is an integer value
+            v = configOverridesMap.get(key);
+            long vlong = ConversionUtils.convert(v, Long.class);
+            result.put((String) key, String.valueOf(vlong));
+            break;
+          default:
+            LOG.warn("Ignoring unallowed key {} in config_overrides map.", key);
+            break;
+        }
+      }
+    } catch (ClassCastException | NumberFormatException cce) {
+      throw new ParseException("Type violation in config_overrides map values: ", cce);
+    }
+    return result;
+  }
+
+  /**
    * Get the groups defined by the user.
    *
    * The user can specify 0 or more groups.  All arguments from the specified position
@@ -244,16 +369,10 @@ public class GetProfile implements StellarFunction {
    * @param global The global configuration.
    */
   private ColumnBuilder getColumnBuilder(Map<String, Object> global) {
-    // the builder is not currently configurable - but should be made so
     ColumnBuilder columnBuilder;
 
-    if(global.containsKey(PROFILER_COLUMN_FAMILY)) {
-      String columnFamily = (String) global.get(PROFILER_COLUMN_FAMILY);
-      columnBuilder = new ValueOnlyColumnBuilder(columnFamily);
-
-    } else {
-      columnBuilder = new ValueOnlyColumnBuilder();
-    }
+    String columnFamily = (String) global.getOrDefault(PROFILER_COLUMN_FAMILY, PROFILER_COLUMN_FAMILY_DEFAULT);
+    columnBuilder = new ValueOnlyColumnBuilder(columnFamily);
 
     return columnBuilder;
   }
@@ -289,7 +408,7 @@ public class GetProfile implements StellarFunction {
    */
   private HTableInterface getTable(Map<String, Object> global) {
 
-    String tableName = (String) global.getOrDefault(PROFILER_HBASE_TABLE, "profiler");
+    String tableName = (String) global.getOrDefault(PROFILER_HBASE_TABLE, PROFILER_HBASE_TABLE_DEFAULT);
     TableProvider provider = getTableProvider(global);
 
     try {

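Note on the getEffectiveConfig() change above: the merge logic can be tried in isolation with plain Java. The following is a minimal sketch, not the committed code. It assumes only the four key names that appear verbatim in this message (the HBase table and column family keys are left out because their string values are not shown here), it substitutes `String.valueOf` and `Long.parseLong` for Metron's `ConversionUtils`, and it throws `IllegalArgumentException` where the commit throws `ParseException`. The class name `EffectiveConfigSketch` is hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-alone sketch of the override merge; not the Metron class. */
final class EffectiveConfigSketch {

  // Keys whose values are kept as strings.
  static final List<String> STRING_KEYS = Arrays.asList(
      "profiler.client.period.duration.units",
      "hbase.provider.impl");

  // Keys whose values must be integers, stored back as strings.
  static final List<String> INTEGER_KEYS = Arrays.asList(
      "profiler.client.period.duration",
      "profiler.client.salt.divisor");

  static Map<String, Object> merge(Map<String, Object> global, Map<?, ?> overrides) {
    Map<String, Object> result = new HashMap<>();

    // Copy only the recognized keys out of the global config.
    for (String k : STRING_KEYS) {
      if (global.get(k) != null) result.put(k, global.get(k));
    }
    for (String k : INTEGER_KEYS) {
      if (global.get(k) != null) result.put(k, global.get(k));
    }
    if (overrides == null) return result;

    for (Map.Entry<?, ?> e : overrides.entrySet()) {
      if (!(e.getKey() instanceof String)) {
        throw new IllegalArgumentException("Non-string key in overrides: " + e.getKey());
      }
      String key = (String) e.getKey();
      if (STRING_KEYS.contains(key)) {
        result.put(key, String.valueOf(e.getValue()));
      } else if (INTEGER_KEYS.contains(key)) {
        // Accept either a number or a numeric string, but require an integer value.
        try {
          long n = Long.parseLong(String.valueOf(e.getValue()));
          result.put(key, String.valueOf(n));
        } catch (NumberFormatException ex) {
          throw new IllegalArgumentException("Type violation for key " + key, ex);
        }
      }
      // Any other key is ignored, mirroring the 'default' branch in the diff.
    }
    return result;
  }
}
```

Because the merged map contains only recognized keys with normalized string values, comparing it to the previously cached map with `equals` is enough to decide whether a new client has to be built, which is what the lazy initialization in `apply()` relies on.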
http://git-wip-us.apache.org/repos/asf/incubator-metron/blob/9ec2cdcd/metron-analytics/metron-profiler-client/src/test/java/org/apache/metron/profiler/client/GetProfileTest.java
----------------------------------------------------------------------
diff --git a/metron-analytics/metron-profiler-client/src/test/java/org/apache/metron/profiler/client/GetProfileTest.java b/metron-analytics/metron-profiler-client/src/test/java/org/apache/metron/profiler/client/GetProfileTest.java
index 4bb3420..960795b 100644
--- a/metron-analytics/metron-profiler-client/src/test/java/org/apache/metron/profiler/client/GetProfileTest.java
+++ b/metron-analytics/metron-profiler-client/src/test/java/org/apache/metron/profiler/client/GetProfileTest.java
@@ -69,6 +69,10 @@ public class GetProfileTest {
   private StellarExecutor executor;
   private Map<String, Object> state;
   private ProfileWriter profileWriter;
+  // different values of period and salt divisor, used to test config_overrides feature
+  private static final long periodDuration2 = 1;
+  private static final TimeUnit periodUnits2 = TimeUnit.HOURS;
+  private static final int saltDivisor2 = 2050;
 
   /**
    * A TableProvider that allows us to mock HBase.
@@ -87,6 +91,17 @@ public class GetProfileTest {
     return executor.execute(expression, state, clazz);
   }
 
+  /**
+   * This method sets up the configuration context for both writing profile data
+   * (using profileWriter to mock the complex process of what the Profiler topology
+   * actually does), and then reading that profile data (thereby testing the PROFILE_GET
+   * Stellar client implemented in GetProfile).
+   *
+   * It runs at @Before time, and sets testclass global variables used by the writers and readers.
+   * The various writers and readers are in each test case, not here.
+   *
+   * @return void
+   */
   @Before
   public void setup() {
     state = new HashMap<>();
@@ -117,6 +132,51 @@ public class GetProfileTest {
   }
 
   /**
+   * This method is similar to setup(), in that it sets up profiler configuration context,
+   * but only for the client.  Additionally, it uses periodDuration2, periodUnits2
+   * and saltDivisor2, instead of periodDuration, periodUnits and saltDivisor respectively.
+   *
+   * This is used in the unit tests that test the config_overrides feature of PROFILE_GET.
+   * In these tests, the context from @Before setup() is used to write the data, then the global
+   * context is changed to context2 (from this method).  Each test validates that a default read
+   * using global context2 then gets no valid results (as expected), and that a read using
+   * original context values in the PROFILE_GET config_overrides argument gets all expected results.
+   *
+   * @return context2 - The profiler client configuration context created by this method.
+   *    The context2 values are also set in the configuration of the StellarExecutor
+   *    stored in the global variable 'executor'.  However, there is no API for querying the
+   *    context values from a StellarExecutor, so we output the context2 Context object itself,
+   *    for validation purposes (so that its values can be validated as being significantly
+   *    different from the setup() settings).
+   */
+  private Context setup2() {
+    state = new HashMap<>();
+
+    // global properties
+    Map<String, Object> global = new HashMap<String, Object>() {{
+      put(PROFILER_HBASE_TABLE, tableName);
+      put(PROFILER_COLUMN_FAMILY, columnFamily);
+      put(PROFILER_HBASE_TABLE_PROVIDER, MockTableProvider.class.getName());
+      put(PROFILER_PERIOD, Long.toString(periodDuration2));
+      put(PROFILER_PERIOD_UNITS, periodUnits2.toString());
+      put(PROFILER_SALT_DIVISOR, Integer.toString(saltDivisor2));
+    }};
+
+    // create the modified context
+    Context context2 = new Context.Builder()
+            .with(Context.Capabilities.GLOBAL_CONFIG, () -> global)
+            .build();
+
+    // create the stellar execution environment
+    executor = new DefaultStellarExecutor(
+            new SimpleFunctionResolver()
+                    .withClass(GetProfile.class),
+            context2);
+
+    return context2; //because there is no executor.getContext() method
+  }
+
+  /**
    * Values should be retrievable that have NOT been stored within a group.
    */
   @Test
@@ -168,12 +228,19 @@ public class GetProfileTest {
     state.put("groups", group);
 
     // execute - read the profile values
-    String expr = "PROFILE_GET('profile1', 'entity1', 4, 'HOURS', 'weekends')";
+    String expr = "PROFILE_GET('profile1', 'entity1', 4, 'HOURS', ['weekends'])";
     @SuppressWarnings("unchecked")
     List<Integer> result = run(expr, List.class);
 
     // validate - expect to read all values from the past 4 hours
     Assert.assertEquals(count, result.size());
+
+    // test the deprecated but allowed "varargs" form of groups specification
+    expr = "PROFILE_GET('profile1', 'entity1', 4, 'HOURS', 'weekends')";
+    result = run(expr, List.class);
+
+    // validate - expect to read all values from the past 4 hours
+    Assert.assertEquals(count, result.size());
   }
 
   /**
@@ -199,12 +266,19 @@ public class GetProfileTest {
     state.put("groups", group);
 
     // execute - read the profile values
-    String expr = "PROFILE_GET('profile1', 'entity1', 4, 'HOURS', 'weekdays', 'tuesday')";
+    String expr = "PROFILE_GET('profile1', 'entity1', 4, 'HOURS', ['weekdays', 'tuesday'])";
     @SuppressWarnings("unchecked")
     List<Integer> result = run(expr, List.class);
 
     // validate - expect to read all values from the past 4 hours
     Assert.assertEquals(count, result.size());
+
+    // test the deprecated but allowed "varargs" form of groups specification
+    expr = "PROFILE_GET('profile1', 'entity1', 4, 'HOURS', 'weekdays', 'tuesday')";
+    result = run(expr, List.class);
+
+    // validate - expect to read all values from the past 4 hours
+    Assert.assertEquals(count, result.size());
   }
 
   /**
@@ -254,4 +328,108 @@ public class GetProfileTest {
     // validate - there should be no values from only 4 seconds ago
     Assert.assertEquals(0, result.size());
   }
+
+  /**
+   * Values should be retrievable that were written with configuration different than current global config.
+   */
+  @Test
+  public void testWithConfigOverride() {
+    final int periodsPerHour = 4;
+    final int expectedValue = 2302;
+    final int hours = 2;
+    final long startTime = System.currentTimeMillis() - TimeUnit.HOURS.toMillis(hours);
+    final List<Object> group = Collections.emptyList();
+
+    // setup - write some measurements to be read later
+    final int count = hours * periodsPerHour;
+    ProfileMeasurement m = new ProfileMeasurement()
+            .withProfileName("profile1")
+            .withEntity("entity1")
+            .withPeriod(startTime, periodDuration, periodUnits);
+    profileWriter.write(m, count, group, val -> expectedValue);
+
+    // now change the executor configuration
+    Context context2 = setup2();
+    // validate it is changed in significant way
+    @SuppressWarnings("unchecked")
+    Map<String, Object> global = (Map<String, Object>) context2.getCapability(Context.Capabilities.GLOBAL_CONFIG).get();
+    Assert.assertEquals(global.get(PROFILER_PERIOD), Long.toString(periodDuration2));
+    Assert.assertNotEquals(periodDuration, periodDuration2);
+
+    // execute - read the profile values - with (wrong) default global config values.
+    // No error message at this time, but returns empty results list, because
+    // row keys are not correctly calculated.
+    String expr = "PROFILE_GET('profile1', 'entity1', 4, 'HOURS')";
+    @SuppressWarnings("unchecked")
+    List<Integer> result = run(expr, List.class);
+
+    // validate - expect to fail to read any values
+    Assert.assertEquals(0, result.size());
+
+    // execute - read the profile values - with config_override.
+    // first two override values are strings, third is deliberately a number.
+    expr = "PROFILE_GET('profile1', 'entity1', 4, 'HOURS', [], {"
+            + "'profiler.client.period.duration' : '" + periodDuration + "', "
+            + "'profiler.client.period.duration.units' : '" + periodUnits.toString() + "', "
+            + "'profiler.client.salt.divisor' : " + saltDivisor + " })";
+    result = run(expr, List.class);
+
+    // validate - expect to read all values from the past 4 hours
+    Assert.assertEquals(count, result.size());
+  }
+
+  /**
+   * Values should be retrievable that have been stored within a 'group', with
+   * configuration different than current global config.
+   * This time put the config_override case before the non-override case.
+   */
+  @Test
+  public void testWithConfigAndOneGroup() {
+    final int periodsPerHour = 4;
+    final int expectedValue = 2302;
+    final int hours = 2;
+    final long startTime = System.currentTimeMillis() - TimeUnit.HOURS.toMillis(hours);
+    final List<Object> group = Arrays.asList("weekends");
+
+    // setup - write some measurements to be read later
+    final int count = hours * periodsPerHour;
+    ProfileMeasurement m = new ProfileMeasurement()
+            .withProfileName("profile1")
+            .withEntity("entity1")
+            .withPeriod(startTime, periodDuration, periodUnits);
+    profileWriter.write(m, count, group, val -> expectedValue);
+
+    // create a variable that contains the groups to use
+    state.put("groups", group);
+
+    // now change the executor configuration
+    Context context2 = setup2();
+    // validate it is changed in significant way
+    @SuppressWarnings("unchecked")
+    Map<String, Object> global = (Map<String, Object>) context2.getCapability(Context.Capabilities.GLOBAL_CONFIG).get();
+    Assert.assertEquals(global.get(PROFILER_PERIOD), Long.toString(periodDuration2));
+    Assert.assertNotEquals(periodDuration, periodDuration2);
+
+    // execute - read the profile values - with config_override.
+    // first two override values are strings, third is deliberately a number.
+    String expr = "PROFILE_GET('profile1', 'entity1', 4, 'HOURS', ['weekends'], {"
+            + "'profiler.client.period.duration' : '" + periodDuration + "', "
+            + "'profiler.client.period.duration.units' : '" + periodUnits.toString() + "', "
+            + "'profiler.client.salt.divisor' : " + saltDivisor + " })";
+    @SuppressWarnings("unchecked")
+    List<Integer> result = run(expr, List.class);
+
+    // validate - expect to read all values from the past 4 hours
+    Assert.assertEquals(count, result.size());
+
+    // execute - read the profile values - with (wrong) default global config values.
+    // No error message at this time, but returns empty results list, because
+    // row keys are not correctly calculated.
+    expr = "PROFILE_GET('profile1', 'entity1', 4, 'HOURS', ['weekends'])";
+    result = run(expr, List.class);
+
+    // validate - expect to fail to read any values
+    Assert.assertEquals(0, result.size());
+  }
+
 }

http://git-wip-us.apache.org/repos/asf/incubator-metron/blob/9ec2cdcd/metron-analytics/metron-profiler/README.md
----------------------------------------------------------------------
diff --git a/metron-analytics/metron-profiler/README.md b/metron-analytics/metron-profiler/README.md
index 2d347d2..04e1c0d 100644
--- a/metron-analytics/metron-profiler/README.md
+++ b/metron-analytics/metron-profiler/README.md
@@ -24,6 +24,12 @@ This section will describe the steps required to get your first profile running.
     hbase(main):001:0> create 'profiler', 'P'
     ```
     
+1. Edit the configuration file located at `$METRON_HOME/config/profiler.properties`.  Change the kafka.zk and kafka.broker values from "node1" to the appropriate host name.  Keep the same port numbers:
+    ```
+    kafka.zk=node1:2181
+    kafka.broker=node1:6667
+    ```
+
 1. Define the profile in a file located at `$METRON_HOME/config/zookeeper/profiler.json`.  The following example JSON will create a profile that simply counts the number of messages per `ip_src_addr`, during each sampling interval.
     ```
     {
@@ -31,7 +37,7 @@ This section will describe the steps required to get your first profile running.
         {
           "profile": "test",
           "foreach": "ip_src_addr",
-          "init":    { "count": 0 },
+          "init":    { "count": "0" },
           "update":  { "count": "count + 1" },
           "result":  "count"
         }
@@ -39,7 +45,7 @@ This section will describe the steps required to get your first profile running.
     }
     ```
 
-1. Upload the profile definition to Zookeeper.
+1. Upload the profile definition to Zookeeper.  (As always, change "node1" to the actual hostname.)
     ```
     $ cd $METRON_HOME
     $ bin/zk_load_configs.sh -m PUSH -i config/zookeeper/ -z node1:2181
@@ -58,7 +64,9 @@ This section will describe the steps required to get your first profile running.
     hbase(main):001:0> count 'profiler'
     ``` 
 
-1. Use the Profiler Client to read the profile data.  The below example `PROFILE_GET` command will read data written by the sample profile given above, if 10.0.0.1 is one of the input values for `ip_src_addr`.  More information on using the client can be found [here](../metron-profiler-client).
+1. Use the Profiler Client to read the profile data.  The below example `PROFILE_GET` command will read data written by the sample profile given above, if 10.0.0.1 is one of the input values for `ip_src_addr`.
+More information on configuring and using the client can be found [here](../metron-profiler-client).
+It is assumed that the PROFILE_GET client is correctly configured before using it.
     ```
     $ bin/stellar -z node1:2181
     
@@ -68,7 +76,9 @@ This section will describe the steps required to get your first profile running.
 
 ## Creating Profiles
 
-The Profiler configuration requires a JSON-formatted set of elements, many of which can contain Stellar code.  The configuration contains the following elements.  For the impatient, skip ahead to the [Examples](#examples).
+The Profiler specification requires a JSON-formatted set of elements, many of which can contain Stellar code.  The specification contains the following elements.  (For the impatient, skip ahead to the [Examples](#examples).)
+The specification for the Profiler topology is stored in Zookeeper at  `/metron/topology/profiler`.  These properties also exist in the local filesystem at `$METRON_HOME/config/zookeeper/profiler.json`.
+The values can be changed on disk and then uploaded to Zookeeper using `$METRON_HOME/bin/zk_load_configs.sh`.
 
 | Name 	                |               | Description 	
 |---	                |---	        |---
@@ -117,7 +127,7 @@ The 'groupBy' expressions can refer to any field within a `org.apache.metron.pro
 
 *Optional*
 
-One or more expressions executed at the start of a window period.  A map is expected where the key is the variable name and the value is a Stellar expression.  The map can contain 0 or more variables/expressions. At the start of each window period the expression is executed once and stored in a variable with the given name.
+One or more expressions executed at the start of a window period.  A map is expected where the key is the variable name and the value is a Stellar expression.  The map can contain zero or more variable:expression pairs. At the start of each window period, each expression is executed once and stored in the given variable. Note that constant init values such as "0" must be in quotes regardless of their type, as the init value must be a string to be executed by Stellar.
 
 ```
 "init": {
@@ -143,7 +153,7 @@ One or more expressions executed when a message is applied to the profile.  A ma
 
 *Required*
 
-A Stellar expression that is executed when the window period expires.  The expression is expected to summarize the messages that were applied to the profile over the window period.  The expression must result in a numeric value such as a Double, Long, Float, Short, or Integer.
+A Stellar expression that is executed when the window period expires.  The expression is expected to summarize the messages that were applied to the profile over the window period, using the state accumulated by the updates.  The result will typically be a single numeric value, but it may be any serializable object, as shown in Example 4 below.
 
 ### `expires`
 
@@ -153,7 +163,8 @@ A numeric value that defines how many days the profile data is retained.  After
 
 ## Configuring the Profiler
 
-The Profiler runs as an independent Storm topology.  The configuration for the Profiler topology is stored in Zookeeper at  `/metron/topology/profiler`.  These properties also exist in the the default installation of Metron at `$METRON_HOME/config/zookeeper/profiler.json`. The values can be changed on disk and then uploaded to Zookeeper using `$METRON_HOME/bin/zk_load_configs.sh`.
+The Profiler runs as an independent Storm topology.  The configuration for the Profiler topology is stored in local filesystem at `$METRON_HOME/config/profiler.properties`.
+The values can be changed on disk and then the Profiler topology must be restarted.
 
 | Setting   | Description   |
 |---        |---            |
@@ -300,7 +311,7 @@ This creates a profile...
 
 It is important to note that the Profiler can persist any serializable Object, not just numeric values.  An alternative to the previous example could take advantage of this.
 
-Instead of storing the mean of the length, the profile could store a more generic summary of the length.  This summary can then be used at a later time to calculate the mean, min, max, percentiles, or any other sensible metric.  This provides a much greater degree of flexibility.
+Instead of storing the mean of the lengths, the profile could store a statistical summarization of the lengths.  This summary can then be used at a later time to calculate the mean, min, max, percentiles, or any other sensible metric.  This provides a much greater degree of flexibility.
  
 ```
 {
@@ -316,7 +327,8 @@ Instead of storing the mean of the length, the profile could store a more generi
 }
 ``` 
 
-The following Stellar REPL session shows how you might use this summary to calculate different metrics with the same underlying profile data.
+The following Stellar REPL session shows how you might use this summary to calculate different metrics with the same underlying profile data.
+It is assumed that the PROFILE_GET client is configured as described [here](../metron-profiler-client).
 
 Retrieve the last 30 minutes of profile measurements for a specific host.
 ```

http://git-wip-us.apache.org/repos/asf/incubator-metron/blob/9ec2cdcd/metron-platform/metron-common/README.md
----------------------------------------------------------------------
diff --git a/metron-platform/metron-common/README.md b/metron-platform/metron-common/README.md
index 019c538..cbf0180 100644
--- a/metron-platform/metron-common/README.md
+++ b/metron-platform/metron-common/README.md
@@ -413,8 +413,9 @@ MAP_GET`
     * entity - The name of the entity.
     * durationAgo - How long ago should values be retrieved from?
     * units - The units of 'durationAgo'.
-    * groups - Optional - The groups used to sort the profile.
-  * Returns: The profile measurements.
+    * groups_list - Optional, must correspond to the 'groupBy' list used in profile creation - List (in square brackets) of groupBy values used to filter the profile. Default is the empty list, meaning groupBy was not used when creating the profile.
+    * config_overrides - Optional - Map (in curly braces) of name:value pairs, each overriding the global config parameter of the same name. Default is the empty Map, meaning no overrides.
+  * Returns: The selected profile measurements.
 
 ### `PROTOCOL_TO_NAME`
   * Description: Converts the IANA protocol number to the protocol name
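
Note on the `config_overrides` argument documented in the hunk above: the tests in this commit assemble the `PROFILE_GET` expression by string concatenation, passing an empty groups list followed by a map of overrides. The following is a minimal sketch of that assembly as a standalone helper. The class name `ProfileGetExpr`, both method names, and the override values in `main` (`15`, `MINUTES`, `1000`) are hypothetical placeholders, not part of the commit; only the three `profiler.client.*` key names and the argument order come from the diff. It does not escape quotes inside values, so it suits simple literals only.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class ProfileGetExpr {

  /** Renders a map literal in the style used by the tests: strings single-quoted, numbers bare. */
  static String toStellarMap(Map<String, Object> overrides) {
    StringJoiner joiner = new StringJoiner(", ", "{", "}");
    for (Map.Entry<String, Object> e : overrides.entrySet()) {
      Object v = e.getValue();
      // Numbers are emitted without quotes; everything else is single-quoted.
      String rendered = (v instanceof Number) ? v.toString() : "'" + v + "'";
      joiner.add("'" + e.getKey() + "' : " + rendered);
    }
    return joiner.toString();
  }

  /** Builds a PROFILE_GET call with an empty groups list and the given overrides. */
  static String profileGet(String profile, String entity, long durationAgo,
                           String units, Map<String, Object> overrides) {
    return "PROFILE_GET('" + profile + "', '" + entity + "', " + durationAgo
        + ", '" + units + "', [], " + toStellarMap(overrides) + ")";
  }

  public static void main(String[] args) {
    // Placeholder values; use the settings the profile was actually written with.
    Map<String, Object> overrides = new LinkedHashMap<>();
    overrides.put("profiler.client.period.duration", "15");
    overrides.put("profiler.client.period.duration.units", "MINUTES");
    overrides.put("profiler.client.salt.divisor", 1000);
    System.out.println(profileGet("profile1", "entity1", 4, "HOURS", overrides));
  }
}
```

A `LinkedHashMap` keeps the overrides in insertion order, so the generated expression is stable across runs. As in the tests, the first two override values are strings and the third is a number.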


