metron-commits mailing list archives

From mmiklav...@apache.org
Subject [metron] branch master updated: METRON-1795: General Purpose Regex Parser (jadeepsinh2 via mmiklavc) closes apache/metron#1245
Date Mon, 17 Dec 2018 16:52:01 GMT
This is an automated email from the ASF dual-hosted git repository.

mmiklavcic pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/metron.git


The following commit(s) were added to refs/heads/master by this push:
     new b8e426c  METRON-1795: General Purpose Regex Parser (jadeepsinh2 via mmiklavc) closes apache/metron#1245
b8e426c is described below

commit b8e426c755a5969e24dba50f5d8fa81d1ccb472d
Author: jagdeepsingh2 <jagdeep.singh.2@team.telstra.com>
AuthorDate: Mon Dec 17 09:44:50 2018 -0700

    METRON-1795: General Purpose Regex Parser (jadeepsinh2 via mmiklavc) closes apache/metron#1245
---
 metron-platform/metron-parsers/README.md           |  90 +++++
 .../parsers/regex/RegularExpressionsParser.java    | 435 +++++++++++++++++++++
 .../regex/RegularExpressionsParserTest.java        | 275 +++++++++++++
 3 files changed, 800 insertions(+)

diff --git a/metron-platform/metron-parsers/README.md b/metron-platform/metron-parsers/README.md
index cfcf6ed..5aff84a 100644
--- a/metron-platform/metron-parsers/README.md
+++ b/metron-platform/metron-parsers/README.md
@@ -52,6 +52,96 @@ There are two general types types of parsers:
        This is using the default value for `wrapEntityName` if that property is not set.
     * `wrapEntityName` : Sets the name to use when wrapping JSON using `wrapInEntityArray`. The `jsonpQuery` should reference this name.
     * A field called `timestamp` is expected to exist and, if it does not, then current time is inserted.  
+  * Regular Expressions Parser
+      * `recordTypeRegex` : A regular expression to uniquely identify a record type.
+      * `messageHeaderRegex` : A regular expression used to extract fields from a message part which is common across all the messages.
+      * `convertCamelCaseToUnderScore` : If this property is set to true, this parser will automatically convert all the camel case property names to underscore separated. 
+          For example, the following conversions will automatically happen:
+
+          ```
+          ipSrcAddr -> ip_src_addr
+          ipDstAddr -> ip_dst_addr
+          ipSrcPort -> ip_src_port
+          ```
+          Note this property may be necessary, because Java does not support underscores in named group names. So if your property naming conventions require underscores in property names, use this property.
+          
+      * `fields` : A JSON list of maps containing a record type to regular expression mapping.
+      
+      A complete configuration example would look like:
+      
+      ```json
+      "convertCamelCaseToUnderScore": true, 
+      "recordTypeRegex": "kernel|syslog",
+      "messageHeaderRegex": "(?<syslogPriority>(?<=^<)\\d{1,4}(?=>)).*?(?<timestamp>(?<=>)[A-Za-z]{3}\\s{1,2}\\d{1,2}\\s\\d{1,2}:\\d{1,2}:\\d{1,2}(?=\\s)).*?(?<syslogHost>(?<=\\s).*?(?=\\s))",
+      "fields": [
+        {
+          "recordType": "kernel",
+          "regex": ".*(?<eventInfo>(?<=\\]|\\w\\:).*?(?=$))"
+        },
+        {
+          "recordType": "syslog",
+          "regex": ".*(?<processid>(?<=PID\\s=\\s).*?(?=\\sLine)).*(?<filePath>(?<=64\\s)\/([A-Za-z0-9_-]+\/)+(?=\\w))(?<fileName>.*?(?=\")).*(?<eventInfo>(?<=\").*?(?=$))"
+        }
+      ]
+      ```
+      **Note**: messageHeaderRegex and regex (within fields) could also be specified as lists, e.g.
+      ```json
+          "messageHeaderRegex": [
+          "regular expression 1",
+          "regular expression 2"
+          ]
+      ```
+      Where **regular expression 1** and **regular expression 2** are valid regular expressions and may have named
+      groups, which would be extracted into fields. This list will be evaluated in order until a
+      matching regular expression is found.
+      
+      **messageHeaderRegex** is run on all the messages.
+      All the messages are therefore expected to contain the fields which are being extracted using the **messageHeaderRegex**.
+      **messageHeaderRegex** is a sort of HCF (highest common factor) in all messages.
+      
+      **recordTypeRegex** can be a more advanced regular expression containing named groups. For example
+  
+      "recordTypeRegex": "(?&lt;process&gt;(?&lt;=\\s)\\b(kernel|syslog)\\b(?=\\[|:))"
+      
+      Here all the named groups (process in above example) will be extracted as fields.
+
+      Though having named groups in recordType is completely optional, one might still want to extract named groups in recordType for the following reasons:
+
+      1. Since the **recordType** regular expression is already being matched and we are already paying the price for a regular expression match,
+      we can extract certain fields as a by-product of this match.
+      2. Most likely the **recordType** field is common across all the messages. Hence having it extracted in the recordType (or messageHeaderRegex) would
+      reduce the overall complexity of regular expressions in the regex field.
+      
+      **regex** within a field could also be a list of regular expressions. In this case the regular expressions in the list are tried in order until a match is found. Once a full match is found, the remaining regular expressions are ignored.
+  
+      ```json
+          "regex":  [ "record type specific regular expression 1",
+                      "record type specific regular expression 2"]
+
+      ```
+
+      **timestamp**
+
+      Since this parser is a general purpose parser, it will populate the timestamp field with the current UTC timestamp. The actual timestamp value can be overridden later using Stellar.
+      For example, in the case of syslog timestamps, one could use the following Stellar construct to override the timestamp value.
+      Let us say you parsed the actual timestamp from the raw log:
+
+      <38>Jun 20 15:01:17 hostName sshd[11672]: Accepted publickey for prod from 55.55.55.55 port 66666 ssh2
+
+      syslogTimestamp="Jun 20 15:01:17"
+
+      Then something like below could be used to override the timestamp.
+
+      ```
+      "timestamp_str": "FORMAT('%s%s%s', YEAR(),' ',syslogTimestamp)",
+      "timestamp":"TO_EPOCH_TIMESTAMP(timestamp_str, 'yyyy MMM dd HH:mm:ss' )"
+      ```
+
+      OR, if you want to factor in the timezone
+
+      ```
+      "timestamp":"TO_EPOCH_TIMESTAMP(timestamp_str, timestamp_format, timezone_name )"
+      ```
 
 ## Parser Error Routing
 
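As an illustration of the `messageHeaderRegex` behaviour described in the README hunk above, the following sketch applies the header expression from the configuration example to the sample sshd line, using only `java.util.regex`. The class name `HeaderRegexSketch` and the method `extractHeader` are hypothetical and are not part of this commit; the group names are taken from the README example, and the expression assumes the `(?<name>...)` form shown in the class Javadoc further down.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; not part of the Metron code base.
class HeaderRegexSketch {

    // Header expression from the README example, written as a Java string literal.
    static final Pattern HEADER = Pattern.compile(
        "(?<syslogPriority>(?<=^<)\\d{1,4}(?=>)).*?"
            + "(?<timestamp>(?<=>)[A-Za-z]{3}\\s{1,2}\\d{1,2}\\s\\d{1,2}:\\d{1,2}:\\d{1,2}(?=\\s)).*?"
            + "(?<syslogHost>(?<=\\s).*?(?=\\s))");

    // Group names are listed by hand here; the parser in this commit derives them from the regex.
    static final String[] GROUPS = {"syslogPriority", "timestamp", "syslogHost"};

    // Returns the non-blank named groups of the first header match, or an empty map if none.
    static Map<String, String> extractHeader(String message) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = HEADER.matcher(message);
        if (m.find()) {
            for (String name : GROUPS) {
                String value = m.group(name);
                if (value != null && !value.trim().isEmpty()) {
                    fields.put(name, value.trim());
                }
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        String line = "<38>Jun 20 15:01:17 hostName sshd[11672]: "
            + "Accepted publickey for prod from 55.55.55.55 port 66666 ssh2";
        System.out.println(extractHeader(line));
    }
}
```

Like the parser's header extraction, the sketch uses `find()` rather than `matches()`, so the header expression only needs to cover the common prefix of the message and not the whole line.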
diff --git a/metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/regex/RegularExpressionsParser.java b/metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/regex/RegularExpressionsParser.java
new file mode 100644
index 0000000..c9f1ec9
--- /dev/null
+++ b/metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/regex/RegularExpressionsParser.java
@@ -0,0 +1,435 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more contributor license
+ * agreements. See the NOTICE file distributed with this work for additional information regarding
+ * copyright ownership. The ASF licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License. You may obtain a
+ * copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software distributed under the License
+ * is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
+ * or implied. See the License for the specific language governing permissions and limitations under
+ * the License.
+ */
+
+package org.apache.metron.parsers.regex;
+
+import com.google.common.base.CaseFormat;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.metron.common.Constants;
+import org.apache.metron.parsers.BasicParser;
+import org.json.simple.JSONObject;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.lang.invoke.MethodHandles;
+import java.nio.charset.Charset;
+import java.util.*;
+import java.util.regex.Matcher;
+import java.util.regex.Pattern;
+import java.util.regex.PatternSyntaxException;
+
+//@formatter:off
+/**
+ * General purpose class to parse unstructured text message into a json object. This class parses
+ * the message as per supplied parser config as part of sensor config. Sensor parser config example:
+ *
+ * <pre>
+ * <code>
+ * "convertCamelCaseToUnderScore": true,
+ * "recordTypeRegex": "(?&lt;process&gt;(?&lt;=\\s)\\b(kernel|syslog)\\b(?=\\[|:))",
+ * "messageHeaderRegex": "(?&lt;syslogpriority&gt;(?&lt;=^&lt;)\\d{1,4}(?=&gt;)).*?(?&lt;timestamp&gt;(?&lt;=&gt;)[A-Za-z]{3}\\s{1,2}\\d{1,2}\\s\\d{1,2}:\\d{1,2}:\\d{1,2}(?=\\s)).*?(?&lt;syslogHost&gt;(?&lt;=\\s).*?(?=\\s))",
+ * "fields": [
+ * {
+ * "recordType": "kernel",
+ * "regex": ".*(?&lt;eventInfo&gt;(?&lt;=\\]|\\w\\:).*?(?=$))"
+ * },
+ * {
+ * "recordType": "syslog",
+ * "regex": ".*(?&lt;processid&gt;(?&lt;=PID\\s=\\s).*?(?=\\sLine)).*(?&lt;filePath&gt;(?&lt;=64\\s)\/([A-Za-z0-9_-]+\/)+(?=\\w))(?&lt;fileName&gt;.*?(?=\")).*(?&lt;eventInfo&gt;(?&lt;=\").*?(?=$))"
+ * }
+ * ]
+ * </code>
+ * </pre>
+ *
+ * Note: messageHeaderRegex could be specified as lists also e.g.
+ *
+ * <pre>
+ * <code>
+ * "messageHeaderRegex": [
+ * "regular expression 1",
+ * "regular expression 2"
+ * ]
+ * </code>
+ * </pre>
+ *
+ * Where <strong>regular expression 1</strong> and <strong>regular expression 2</strong> are
+ * valid regular expressions and may have named groups, which would be extracted into fields.
+ * This list will be evaluated in order until a matching regular expression is found.<br>
+ * <br>
+ *
+ * <strong>Configuration fields explanation</strong>
+ *
+ * <pre>
+ * recordTypeRegex : used to specify a regular expression to distinctly identify a record type.
+ * messageHeaderRegex :  used to specify a regular expression to extract fields from a message part which is common across all the messages.
+ * e.g. rhel logs looks like
+ * <code>
+ * &lt;7&gt;Jun 26 16:18:01 hostName kernel: SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
+ * </code>
+ * <br>
+ * </pre>
+ *
+ * Here message structure (&lt;7&gt;Jun 26 16:18:01 hostName kernel) is common across all messages.
+ * Hence messageHeaderRegex could be used to extract fields from this part.
+ *
+ * fields : json list of objects containing recordType and regex. regex could be a further list e.g.
+ *
+ * <pre>
+ * <code>
+ * "regex":  [ "record type specific regular expression 1",
+ *             "record type specific regular expression 2"]
+ *
+ * </code>
+ * </pre>
+ *
+ * <strong>Limitation</strong> <br>
+ * Currently the named groups in Java regular expressions have a limitation on their names.
+ * A capturing group can be assigned a "name" (a named-capturing group) and then be
+ * back-referenced later by that name. Group names are composed of the following characters,
+ * and the first character must be a letter.
+ *
+ * <pre>
+ * <code>
+ * The uppercase letters 'A' through 'Z' ('\u0041' through '\u005a'),
+ * The lowercase letters 'a' through 'z' ('\u0061' through '\u007a'),
+ * The digits '0' through '9' ('\u0030' through '\u0039'),
+ * </code>
+ * </pre>
+ *
+ * This means that an _ (underscore) cannot be used as part of a named group name. E.g. this is an
+ * invalid regular expression <code>.*(?&lt;event_info&gt;(?&lt;=\\]|\\w\\:).*?(?=$))</code>
+ *
+ * However, this limitation can be easily overcome by adding a parser configuration setting.
+ *
+ * <code>
+ *  "convertCamelCaseToUnderScore": true,
+ * </code>
+ * If the above property is added to the sensor parser configuration, in the parserConfig object,
+ * this parser will automatically convert all the camel case property names to underscore
+ * separated. For example, the following conversions will automatically happen:
+ *
+ * <code>
+ * ipSrcAddr -> ip_src_addr
+ * ipDstAddr -> ip_dst_addr
+ * ipSrcPort -> ip_src_port
+ * </code>
+ * etc.
+ */
+//@formatter:on
+public class RegularExpressionsParser extends BasicParser {
+
+    protected static final Logger LOG =
+        LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
+    private static final Charset UTF_8 = Charset.forName("UTF-8");
+
+    private List<Map<String, Object>> fields;
+    private Map<String, Object> parserConfig;
+    private final Pattern namedGroupPattern = Pattern.compile("\\(\\?<([a-zA-Z][a-zA-Z0-9]*)>");
+    Pattern capitalLettersPattern = Pattern.compile("^.*[A-Z]+.*$");
+    private Pattern recordTypePattern;
+    private final Set<String> recordTypePatternNamedGroups = new HashSet<>();
+    private final Map<String, Map<Pattern, Set<String>>> recordTypePatternMap =
+        new LinkedHashMap<>();
+    private final Map<Pattern, Set<String>> messageHeaderPatternsMap = new LinkedHashMap<>();
+
+    /**
+     * Parses an unstructured text message into a json object based upon the regular expression
+     * configuration supplied.
+     *
+     * @param rawMessage incoming unstructured raw text.
+     * @return List of parsed json objects. In this case the list will have a single element only.
+     */
+    @Override
+    public List<JSONObject> parse(byte[] rawMessage) {
+        String originalMessage = null;
+        try {
+            originalMessage = new String(rawMessage, UTF_8).trim();
+            LOG.debug(" raw message. {}", originalMessage);
+            if (originalMessage.isEmpty()) {
+                LOG.warn("Message is empty.");
+                return Arrays.asList(new JSONObject());
+            }
+        } catch (Exception e) {
+            LOG.error("[Metron] Could not read raw message. {}", originalMessage, e);
+            throw new RuntimeException(e.getMessage(), e);
+        }
+
+        JSONObject parsedJson = new JSONObject();
+        if (messageHeaderPatternsMap.size() > 0) {
+            parsedJson.putAll(extractHeaderFields(originalMessage));
+        }
+        parsedJson.putAll(parse(originalMessage));
+        parsedJson.put(Constants.Fields.ORIGINAL.getName(), originalMessage);
+        /**
+         * Populate the output json with default timestamp.
+         */
+        parsedJson.put(Constants.Fields.TIMESTAMP.getName(), System.currentTimeMillis());
+        applyFieldTransformations(parsedJson);
+        return Arrays.asList(parsedJson);
+    }
+
+    private void applyFieldTransformations(JSONObject parsedJson) {
+        if (getParserConfig().get(ParserConfigConstants.CONVERT_CAMELCASE_TO_UNDERSCORE.getName())
+            != null && (Boolean) getParserConfig()
+            .get(ParserConfigConstants.CONVERT_CAMELCASE_TO_UNDERSCORE.getName())) {
+            convertCamelCaseToUnderScore(parsedJson);
+        }
+
+    }
+
+    // @formatter:off
+  /**
+   * This method is called during the parser initialization. It parses the parser
+   * configuration and configures the parser accordingly. It then initializes
+   * instance variables.
+   *
+   * @param parserConfig ParserConfig(Map<String, Object>) supplied to the sensor.
+   * @see org.apache.metron.parsers.interfaces.Configurable#configure(java.util.Map)<br>
+   *      <br>
+   */
+  // @formatter:on
+  @Override
+  public void configure(Map<String, Object> parserConfig) {
+      setParserConfig(parserConfig);
+      setFields((List<Map<String, Object>>) getParserConfig()
+          .get(ParserConfigConstants.FIELDS.getName()));
+      String recordTypeRegex =
+          (String) getParserConfig().get(ParserConfigConstants.RECORD_TYPE_REGEX.getName());
+
+      if (StringUtils.isBlank(recordTypeRegex)) {
+          LOG.error("Invalid config :recordTypeRegex is missing in parserConfig");
+          throw new IllegalStateException(
+              "Invalid config :recordTypeRegex is missing in parserConfig");
+      }
+
+      setRecordTypePattern(recordTypeRegex);
+      recordTypePatternNamedGroups.addAll(getNamedGroups(recordTypeRegex));
+      List<Map<String, Object>> fields =
+          (List<Map<String, Object>>) getParserConfig().get(ParserConfigConstants.FIELDS.getName());
+
+      try {
+          configureRecordTypePatterns(fields);
+          configureMessageHeaderPattern();
+      } catch (PatternSyntaxException e) {
+          LOG.error("Invalid config : {} ", e.getMessage());
+          throw new IllegalStateException("Invalid config : " + e.getMessage());
+      }
+
+      validateConfig();
+  }
+
+    private void configureMessageHeaderPattern() {
+        if (getParserConfig().get(ParserConfigConstants.MESSAGE_HEADER.getName()) != null) {
+            if (getParserConfig()
+                .get(ParserConfigConstants.MESSAGE_HEADER.getName()) instanceof List) {
+                List<String> messageHeaderPatternList = (List<String>) getParserConfig()
+                    .get(ParserConfigConstants.MESSAGE_HEADER.getName());
+                for (String messageHeaderPatternStr : messageHeaderPatternList) {
+                    messageHeaderPatternsMap.put(Pattern.compile(messageHeaderPatternStr),
+                        getNamedGroups(messageHeaderPatternStr));
+                }
+            } else if (getParserConfig()
+                .get(ParserConfigConstants.MESSAGE_HEADER.getName()) instanceof String) {
+                String messageHeaderPatternStr =
+                    (String) getParserConfig().get(ParserConfigConstants.MESSAGE_HEADER.getName());
+                if (StringUtils.isNotBlank(messageHeaderPatternStr)) {
+                    messageHeaderPatternsMap.put(Pattern.compile(messageHeaderPatternStr),
+                        getNamedGroups(messageHeaderPatternStr));
+                }
+            }
+        }
+    }
+
+    private void configureRecordTypePatterns(List<Map<String, Object>> fields) {
+
+        for (Map<String, Object> field : fields) {
+            if (field.get(ParserConfigConstants.RECORD_TYPE.getName()) != null
+                && field.get(ParserConfigConstants.REGEX.getName()) != null) {
+                String recordType =
+                    ((String) field.get(ParserConfigConstants.RECORD_TYPE.getName())).toLowerCase();
+                recordTypePatternMap.put(recordType, new LinkedHashMap<>());
+                if (field.get(ParserConfigConstants.REGEX.getName()) instanceof List) {
+                    List<String> regexList =
+                        (List<String>) field.get(ParserConfigConstants.REGEX.getName());
+                    regexList.forEach(s -> {
+                        recordTypePatternMap.get(recordType)
+                            .put(Pattern.compile(s), getNamedGroups(s));
+                    });
+                } else if (field.get(ParserConfigConstants.REGEX.getName()) instanceof String) {
+                    recordTypePatternMap.get(recordType).put(
+                        Pattern.compile((String) field.get(ParserConfigConstants.REGEX.getName())),
+                        getNamedGroups((String) field.get(ParserConfigConstants.REGEX.getName())));
+                }
+            }
+        }
+    }
+
+    private void setRecordTypePattern(String recordTypeRegex) {
+        if (recordTypeRegex != null) {
+            recordTypePattern = Pattern.compile(recordTypeRegex);
+        }
+    }
+
+    private JSONObject parse(String originalMessage) {
+        JSONObject parsedJson = new JSONObject();
+        Optional<String> recordIdentifier = getField(recordTypePattern, originalMessage);
+        if (recordIdentifier.isPresent()) {
+            extractNamedGroups(parsedJson, recordIdentifier.get(), originalMessage);
+        }
+        /*
+         * Extract fields(named groups) from record type regular expression
+         */
+        Matcher matcher = recordTypePattern.matcher(originalMessage);
+        if (matcher.find()) {
+            for (String namedGroup : recordTypePatternNamedGroups) {
+                if (matcher.group(namedGroup) != null) {
+                    parsedJson.put(namedGroup, matcher.group(namedGroup).trim());
+                }
+            }
+        }
+        return parsedJson;
+    }
+
+    private void extractNamedGroups(Map<String, Object> json, String recordType,
+        String originalMessage) {
+        Map<Pattern, Set<String>> patternMap = recordTypePatternMap.get(recordType.toLowerCase());
+        if (patternMap != null) {
+            for (Map.Entry<Pattern, Set<String>> entry : patternMap.entrySet()) {
+                Pattern pattern = entry.getKey();
+                Set<String> namedGroups = entry.getValue();
+                if (pattern != null && namedGroups != null && namedGroups.size() > 0) {
+                    Matcher m = pattern.matcher(originalMessage);
+                    if (m.matches()) {
+                        LOG.debug("RecordType : {} Trying regex : {} for message : {} ", recordType,
+                            pattern.toString(), originalMessage);
+                        for (String namedGroup : namedGroups) {
+                            if (m.group(namedGroup) != null) {
+                                json.put(namedGroup, m.group(namedGroup).trim());
+                            }
+                        }
+                        break;
+                    }
+                }
+            }
+        } else {
+            LOG.warn("No pattern found for record type : {}", recordType);
+        }
+    }
+
+    public Optional<String> getField(Pattern pattern, String originalMessage) {
+        Matcher matcher = pattern.matcher(originalMessage);
+        if (matcher.find()) {
+            return Optional.of(matcher.group());
+        }
+        return Optional.empty();
+    }
+
+    private Set<String> getNamedGroups(String regex) {
+        Set<String> namedGroups = new TreeSet<>();
+        Matcher matcher = namedGroupPattern.matcher(regex);
+        while (matcher.find()) {
+            namedGroups.add(matcher.group(1));
+        }
+        return namedGroups;
+    }
+
+    private Map<String, Object> extractHeaderFields(String originalMessage) {
+        Map<String, Object> messageHeaderJson = new JSONObject();
+        for (Map.Entry<Pattern, Set<String>> syslogPatternEntry : messageHeaderPatternsMap
+            .entrySet()) {
+            Matcher m = syslogPatternEntry.getKey().matcher(originalMessage);
+            if (m.find()) {
+                for (String namedGroup : syslogPatternEntry.getValue()) {
+                    if (StringUtils.isNotBlank(m.group(namedGroup))) {
+                        messageHeaderJson.put(namedGroup, m.group(namedGroup).trim());
+                    }
+                }
+                break;
+            }
+        }
+        return messageHeaderJson;
+    }
+
+    @Override
+    public void init() {
+        LOG.info("RegularExpressions parser initialised.");
+    }
+
+    public void validateConfig() {
+        if (getFields() == null) {
+            LOG.error("Invalid config :  fields is missing in parserConfig");
+            throw new IllegalStateException("Invalid config :fields is missing in parserConfig");
+        }
+    }
+
+    private void convertCamelCaseToUnderScore(Map<String, Object> json) {
+        Map<String, String> oldKeyNewKeyMap = new HashMap<>();
+        for (Map.Entry<String, Object> entry : json.entrySet()) {
+            if (capitalLettersPattern.matcher(entry.getKey()).matches()) {
+                oldKeyNewKeyMap.put(entry.getKey(),
+                    CaseFormat.UPPER_CAMEL.to(CaseFormat.LOWER_UNDERSCORE, entry.getKey()));
+            }
+        }
+        oldKeyNewKeyMap.forEach((oldKey, newKey) -> json.put(newKey, json.remove(oldKey)));
+    }
+
+    public List<Map<String, Object>> getFields() {
+        return fields;
+    }
+
+    public void setFields(List<Map<String, Object>> fields) {
+        this.fields = fields;
+    }
+
+    public Map<String, Object> getParserConfig() {
+        return parserConfig;
+    }
+
+    public void setParserConfig(Map<String, Object> parserConfig) {
+        this.parserConfig = parserConfig;
+    }
+
+    enum ParserConfigConstants {
+        //@formatter:off
+    RECORD_TYPE("recordType"),
+    RECORD_TYPE_REGEX("recordTypeRegex"),
+    REGEX("regex"),
+    FIELDS("fields"),
+    MESSAGE_HEADER("messageHeaderRegex"),
+    CONVERT_CAMELCASE_TO_UNDERSCORE("convertCamelCaseToUnderScore");
+    //@formatter:on
+    private final String name;
+        private static Map<String, ParserConfigConstants> nameToField;
+
+        ParserConfigConstants(String name) {
+            this.name = name;
+        }
+
+        public String getName() {
+            return name;
+        }
+
+        static {
+            nameToField = new HashMap<>();
+            for (ParserConfigConstants f : ParserConfigConstants.values()) {
+                nameToField.put(f.getName(), f);
+            }
+        }
+
+        public static ParserConfigConstants fromString(String fieldName) {
+            return nameToField.get(fieldName);
+        }
+    }
+}
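The parser above learns which named groups a configured expression contains by scanning the expression text itself with `namedGroupPattern`, and then reads each discovered name from the matcher. A minimal standalone sketch of that discovery step follows; the class name `NamedGroupSketch` and the method name `namedGroups` are hypothetical, and only the scanning pattern is copied from the commit.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; not part of the Metron code base.
class NamedGroupSketch {

    // Same expression the parser uses to spot "(?<name>" openers inside a regex string.
    // Lookbehinds such as "(?<=" are skipped because '=' is not a letter.
    static final Pattern NAMED_GROUP = Pattern.compile("\\(\\?<([a-zA-Z][a-zA-Z0-9]*)>");

    // Collects the group names found in the regex text, sorted alphabetically.
    static Set<String> namedGroups(String regex) {
        Set<String> names = new TreeSet<>();
        Matcher m = NAMED_GROUP.matcher(regex);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        // The kernel record expression from the configuration example.
        String regex = ".*(?<eventInfo>(?<=\\]|\\w\\:).*?(?=$))";
        System.out.println(namedGroups(regex));
    }
}
```

Because the scan works on the raw text, a name containing an underscore (for example `event_info`) is not picked up by it, which fits the naming limitation described in the class Javadoc.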
diff --git a/metron-platform/metron-parsers/src/test/java/org/apache/metron/parsers/regex/RegularExpressionsParserTest.java b/metron-platform/metron-parsers/src/test/java/org/apache/metron/parsers/regex/RegularExpressionsParserTest.java
new file mode 100644
index 0000000..5097ec0
--- /dev/null
+++ b/metron-platform/metron-parsers/src/test/java/org/apache/metron/parsers/regex/RegularExpressionsParserTest.java
@@ -0,0 +1,275 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more contributor license
+ * agreements. See the NOTICE file distributed with this work for additional information regarding
+ * copyright ownership. The ASF licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License. You may obtain a
+ * copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software distributed under the License
+ * is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
+ * or implied. See the License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.metron.parsers.regex;
+
+import org.adrianwalker.multilinestring.Multiline;
+import org.json.simple.JSONObject;
+import org.json.simple.parser.JSONParser;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+public class RegularExpressionsParserTest {
+
+    private RegularExpressionsParser regularExpressionsParser;
+
+    @Before
+    public void setUp() throws Exception {
+        regularExpressionsParser = new RegularExpressionsParser();
+    }
+
+    //@formatter:off
+      /**
+       {
+          "convertCamelCaseToUnderScore": true,
+          "messageHeaderRegex": "(?<syslogpriority>(?<=^<)\\d{1,4}(?=>)).*?(?<timestampDeviceOriginal>(?<=>)[A-Za-z]{3}\\s{1,2}\\d{1,2}\\s\\d{1,2}:\\d{1,2}:\\d{1,2}(?=\\s)).*?(?<deviceName>(?<=\\s).*?(?=\\s))",
+          "recordTypeRegex": "(?<dstProcessName>(?<=\\s)\\b(kesl|sshd|run-parts|kernel|vsftpd|ftpd|su)\\b(?=\\[|:))",
+          "fields": [
+            {
+              "recordType": "kesl",
+              "regex": ".*(?<eventInfo>(?<=\\:).*?(?=$))"
+            },
+            {
+              "recordType": "run-parts",
+              "regex": ".*(?<eventInfo>(?<=\\sparts).*?(?=$))"
+            },
+            {
+              "recordType": "sshd",
+              "regex": [
+                ".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=\\]:\\s).*?(?=\\sfor)).*?(?<dstUserId>(?<=\\sfor\\s).*?(?=\\sfrom)).*?(?<ipSrcAddr>(?<=\\sfrom\\s).*?(?=\\sport)).*?(?<ipSrcPort>(?<=\\sport\\s).*?(?=\\s)).*?(?<appProtocol>(?<=port\\s\\d{1,5}\\s).*(?=:\\s)).*?(?<encryptionAlgorithm>(?<=:\\s).+?(?=\\s)).*(?<correlationId>(?<=\\s).+?(?=$))",
+                ".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=\\]:\\s).*?(?=\\sfor)).*?(?<dstUserId>(?<=\\sfor\\s).*?(?=\\sfrom)).*?(?<ipSrcAddr>(?<=\\sfrom\\s).*?(?=\\sport)).*?(?<ipSrcPort>(?<=\\sport\\s).*?(?=\\s)).*?(?<appProtocol>(?<=port\\s\\d{1,5}\\s).*?(?=$))",
+                ".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<ipDstAddr>(?<=Remote:).*?(?=\\-)).*?(?<ipDstPort>(?<=\\-).*?(?=;)).*?(?<appProtocol>(?<=Protocol:).*?(?=;)).*?(?<sshClient>(?<=Client:).*?(?=$))",
+                ".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<appProtocol>(?<=\\]:).*?(?=:)).*?(?<ipDstAddr>(?<=Remote:).*?(?=\\-)).*?(?<ipDstPort>(?<=\\-).*?(?=;)).*?(?<encryptionAlgorithm>(?<=Enc:\\s).*?(?=$))",
+                ".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<ipDstAddr>(?<=Remote:).*?(?=\\-)).*?(?<ipDstPort>(?<=\\-).*?(?=;)).*?(?<encryptionAlgorithm>(?<=Enc:\\s).*?(?=$))",
+                ".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=:).*?(?=for)).*?(?<dstUserId>(?<=for).*?(?=from)).*?(?<ipSrcAddr>(?<=from).*?(?=port)).*?(?<ipSrcPort>(?<=port).*?(?=\\s)).*?(?<appProtocol>(?<=\\s).*?(?=$))",
+                ".*(?<dstProcessId>(?<=\\[).*?(?=\\]))]:\\s.*?(?<eventInfo>subsystem.*?(?=by\\suser)).*?(?<srcUserId>(?<=user).*?(?=$))",
+                ".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<action>(?<=Received).*?(?=from)).*?(?<ipSrcAddr>(?<=from).*?(?=:)).*?(?<eventInfo>(?<=11:).*?(?=$))",
+                ".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=\\]:\\s)Server\\slistening(?=\\s)).*?(?<ipSrcAddr>(?<=\\son\\s).*?(?=port)).*?(?<ipSrcPort>(?<=port\\s)\\d{1,6}(?=\\.)).*$",
+                ".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=\\]:\\s)Invalid user(?=\\s)).*?(?<dstUserId>(?<=\\s).*?(?=from)).*?(?<ipSrcAddr>(?<=from\\s).*(?=$))",
+                ".*(?<dstProcessId>(?<=\\[).*?(?=]:\\s)).*(?<subProcess>(?<=]:\\s).*\\)(?=:)).*(?<eventInfo>(?<=:\\s).*(?=;)).*(?<logname>(?<=logname=).*?(?=\\s)).*(?<dstUserId>(?<=uid=).*?(?=\\s)).*(?<effectiveUserId>(?<=euid=).*?(?=\\s)).*(?<sessionName>(?<=tty=).*?(?=\\s)).*(?<srcUserId>(?<=ruser=).*?(?=\\s)).*(?<ipSrcAddr>(?<=rhost=).*?(?=\\s)).*(?<userId>(?<=user=).*?(?=$))",
+                ".*(?<dstProcessId>(?<=\\[).*?(?=]:\\s)).*(?<eventInfo>(?<=:\\s).*(?=;)).*(?<logname>(?<=logname=).*?(?=\\s)).*(?<dstUserId>(?<=uid=).*?(?=\\s)).*(?<effectiveUserId>(?<=euid=).*?(?=\\s)).*(?<sessionName>(?<=tty=).*?(?=\\s)).*(?<srcUserId>(?<=ruser=).*?(?=\\s)).*(?<ipSrcAddr>(?<=rhost=).*?(?=\\s)).*(?<userId>(?<=user=).*?(?=$))",
+                ".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=\\]:\\s).*?(?=for)).*?(?<dstUserId>(?<=\\sfor).*?(?=\\[)).*?(?<subProcess>(?<=\\[).*?(?=\\])).*$",
+                ".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=:\\s)Excess permission or bad ownership on file(?=\\s\\/)).*?(?<filePath>(?<=\\s).*(?=\\/)).*?(?<fileName>(?<=\\/).*(?=$))",
+                ".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=:).*?(?=;)).*$",
+                ".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=:).*?(?=\\d)).*$",
+                ".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=:).*?(?=$))"
+              ]
+            },
+            {
+              "recordType": "kernel",
+              "regex": [
+                ".*(?<connectedDeviceName>(?<=\\:\\susb).*?(?=\\:)).*?(?<eventInfo>(?<=\\:).*?(?=$))",
+                ".*(?<subProcess>(?<=\\:\\s).*?(?=\\:)).*?(?<eventInfo>(?<=\\:).*?(?=$))"
+              ]
+            },
+            {
+              "recordType": "vsftpd",
+              "regex": ".*(?<dstProcessId>(?<=\\[).*?(?=]:\\s)).*(?<subProcess>(?<=]:\\s).*\\)(?=:)).*(?<eventInfo>(?<=:\\s).*(?=;)).*(?<effectiveUserId>(?<=euid=).*?(?=\\s)).*(?<sessionName>(?<=tty=).*?(?=\\s)).*(?<srcUserId>(?<=user=).*?(?=\\s)).*(?<ipSrcAddr>(?<=rhost=).*?(?=\\s)).*(?<dstUserId>(?<=user=).*?(?=$))"
+            },
+            {
+              "recordType": "ftpd",
+              "regex": [
+                ".*(?<dstProcessId>(?<=\\[).*?(?=]:\\s)).*(?<eventInfo>(?<=:\\s).*(?=FROM)).*(?<srcHost>(?<=\\s).*?(?=\\s)).*(?<ipSrcAddr>(?<=\\s).*?(?=,)).*(?<dstUserId>(?<=,).*?(?=$))",
+                ".*(?<dstProcessId>(?<=\\[).*?(?=]:\\s)).*(?<eventInfo>(?<=:\\s).*(?=from)).*(?<srcHost>(?<=\\s).*?(?=\\s)).*(?<ipSrcAddr>(?<=\\s).*?(?=,)).*(?<dstUserId>(?<=,).*?(?=$))"
+              ]
+            },
+            {
+              "recordType": "su",
+              "regex": [
+                ".*(?<eventInfo>(?<=:\\s).*(?=for)).*(?<dstUserId>(?<=user=).*?(?=to)).*(?<responseCode>(?<=to).*?(?=$))"
+              ]
+            }
+          ]
+      }
+      */
+    @Multiline
+    public static String parserConfig1;
+    //@formatter:on
+
+
+    @Test
+    public void testSSHDParse() throws Exception {
+        String message =
+            "<38>Jun 20 15:01:17 deviceName sshd[11672]: Accepted publickey for prod from 22.22.22.22 port 55555 ssh2";
+
+        JSONObject parserConfig = (JSONObject) new JSONParser().parse(parserConfig1);
+        regularExpressionsParser.configure(parserConfig);
+        JSONObject parsed = parse(message);
+        // Expected
+        Map<String, Object> expectedJson = new HashMap<>();
+        Assert.assertEquals(parsed.get("device_name"), "deviceName");
+        Assert.assertEquals(parsed.get("dst_process_name"), "sshd");
+        Assert.assertEquals(parsed.get("dst_process_id"), "11672");
+        Assert.assertEquals(parsed.get("dst_user_id"), "prod");
+        Assert.assertEquals(parsed.get("ip_src_addr"), "22.22.22.22");
+        Assert.assertEquals(parsed.get("ip_src_port"), "55555");
+        Assert.assertEquals(parsed.get("app_protocol"), "ssh2");
+        Assert.assertEquals(parsed.get("original_string"),
+            "<38>Jun 20 15:01:17 deviceName sshd[11672]: Accepted publickey for prod from 22.22.22.22 port 55555 ssh2");
+        Assert.assertTrue(parsed.containsKey("timestamp"));
+
+    }
+
+    //@formatter:off
+    /**
+    {
+    "convertCamelCaseToUnderScore": true,
+    "recordTypeRegex": "(?<dstProcessName>(?<=\\s)\\b(kesl|sshd|run-parts|kernel|vsftpd|ftpd|su)\\b(?=\\[|:))",
+    "fields": [
+      {
+        "recordType": "kesl",
+        "regex": ".*(?<eventInfo>(?<=\\:).*?(?=$))"
+      },
+      {
+        "recordType": "run-parts",
+        "regex": ".*(?<eventInfo>(?<=\\sparts).*?(?=$))"
+      },
+      {
+        "recordType": "sshd",
+        "regex": [
+          ".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=\\]:\\s).*?(?=\\sfor)).*?(?<dstUserId>(?<=\\sfor\\s).*?(?=\\sfrom)).*?(?<ipSrcAddr>(?<=\\sfrom\\s).*?(?=\\sport)).*?(?<ipSrcPort>(?<=\\sport\\s).*?(?=\\s)).*?(?<appProtocol>(?<=port\\s\\d{1,5}\\s).*(?=:\\s)).*?(?<encryptionAlgorithm>(?<=:\\s).+?(?=\\s)).*(?<correlationId>(?<=\\s).+?(?=$))",
+          ".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=\\]:\\s).*?(?=\\sfor)).*?(?<dstUserId>(?<=\\sfor\\s).*?(?=\\sfrom)).*?(?<ipSrcAddr>(?<=\\sfrom\\s).*?(?=\\sport)).*?(?<ipSrcPort>(?<=\\sport\\s).*?(?=\\s)).*?(?<appProtocol>(?<=port\\s\\d{1,5}\\s).*?(?=$))",
+          ".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<ipDstAddr>(?<=Remote:).*?(?=\\-)).*?(?<ipDstPort>(?<=\\-).*?(?=;)).*?(?<appProtocol>(?<=Protocol:).*?(?=;)).*?(?<sshClient>(?<=Client:).*?(?=$))",
+          ".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<appProtocol>(?<=\\]:).*?(?=:)).*?(?<ipDstAddr>(?<=Remote:).*?(?=\\-)).*?(?<ipDstPort>(?<=\\-).*?(?=;)).*?(?<encryptionAlgorithm>(?<=Enc:\\s).*?(?=$))",
+          ".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<ipDstAddr>(?<=Remote:).*?(?=\\-)).*?(?<ipDstPort>(?<=\\-).*?(?=;)).*?(?<encryptionAlgorithm>(?<=Enc:\\s).*?(?=$))",
+          ".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=:).*?(?=for)).*?(?<dstUserId>(?<=for).*?(?=from)).*?(?<ipSrcAddr>(?<=from).*?(?=port)).*?(?<ipSrcPort>(?<=port).*?(?=\\s)).*?(?<appProtocol>(?<=\\s).*?(?=$))",
+          ".*(?<dstProcessId>(?<=\\[).*?(?=\\]))]:\\s.*?(?<eventInfo>subsystem.*?(?=by\\suser)).*?(?<srcUserId>(?<=user).*?(?=$))",
+          ".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<action>(?<=Received).*?(?=from)).*?(?<ipSrcAddr>(?<=from).*?(?=:)).*?(?<eventInfo>(?<=11:).*?(?=$))",
+          ".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=\\]:\\s)Server\\slistening(?=\\s)).*?(?<ipSrcAddr>(?<=\\son\\s).*?(?=port)).*?(?<ipSrcPort>(?<=port\\s)\\d{1,6}(?=\\.)).*$",
+          ".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=\\]:\\s)Invalid user(?=\\s)).*?(?<dstUserId>(?<=\\s).*?(?=from)).*?(?<ipSrcAddr>(?<=from\\s).*(?=$))",
+          ".*(?<dstProcessId>(?<=\\[).*?(?=]:\\s)).*(?<subProcess>(?<=]:\\s).*\\)(?=:)).*(?<eventInfo>(?<=:\\s).*(?=;)).*(?<logname>(?<=logname=).*?(?=\\s)).*(?<dstUserId>(?<=uid=).*?(?=\\s)).*(?<effectiveUserId>(?<=euid=).*?(?=\\s)).*(?<sessionName>(?<=tty=).*?(?=\\s)).*(?<srcUserId>(?<=ruser=).*?(?=\\s)).*(?<ipSrcAddr>(?<=rhost=).*?(?=\\s)).*(?<userId>(?<=user=).*?(?=$))",
+          ".*(?<dstProcessId>(?<=\\[).*?(?=]:\\s)).*(?<eventInfo>(?<=:\\s).*(?=;)).*(?<logname>(?<=logname=).*?(?=\\s)).*(?<dstUserId>(?<=uid=).*?(?=\\s)).*(?<effectiveUserId>(?<=euid=).*?(?=\\s)).*(?<sessionName>(?<=tty=).*?(?=\\s)).*(?<srcUserId>(?<=ruser=).*?(?=\\s)).*(?<ipSrcAddr>(?<=rhost=).*?(?=\\s)).*(?<userId>(?<=user=).*?(?=$))",
+          ".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=\\]:\\s).*?(?=for)).*?(?<dstUserId>(?<=\\sfor).*?(?=\\[)).*?(?<subProcess>(?<=\\[).*?(?=\\])).*$",
+          ".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=:\\s)Excess permission or bad ownership on file(?=\\s\\/)).*?(?<filePath>(?<=\\s).*(?=\\/)).*?(?<fileName>(?<=\\/).*(?=$))",
+          ".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=:).*?(?=;)).*$",
+          ".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=:).*?(?=\\d)).*$",
+          ".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=:).*?(?=$))"
+        ]
+      },
+      {
+        "recordType": "kernel",
+        "regex": [
+          ".*(?<connectedDeviceName>(?<=\\:\\susb).*?(?=\\:)).*?(?<eventInfo>(?<=\\:).*?(?=$))",
+          ".*(?<subProcess>(?<=\\:\\s).*?(?=\\:)).*?(?<eventInfo>(?<=\\:).*?(?=$))"
+        ]
+      },
+      {
+        "recordType": "vsftpd",
+        "regex": ".*(?<dstProcessId>(?<=\\[).*?(?=]:\\s)).*(?<subProcess>(?<=]:\\s).*\\)(?=:)).*(?<eventInfo>(?<=:\\s).*(?=;)).*(?<effectiveUserId>(?<=euid=).*?(?=\\s)).*(?<sessionName>(?<=tty=).*?(?=\\s)).*(?<srcUserId>(?<=user=).*?(?=\\s)).*(?<ipSrcAddr>(?<=rhost=).*?(?=\\s)).*(?<dstUserId>(?<=user=).*?(?=$))"
+      },
+      {
+        "recordType": "ftpd",
+        "regex": [
+          ".*(?<dstProcessId>(?<=\\[).*?(?=]:\\s)).*(?<eventInfo>(?<=:\\s).*(?=FROM)).*(?<srcHost>(?<=\\s).*?(?=\\s)).*(?<ipSrcAddr>(?<=\\s).*?(?=,)).*(?<dstUserId>(?<=,).*?(?=$))",
+          ".*(?<dstProcessId>(?<=\\[).*?(?=]:\\s)).*(?<eventInfo>(?<=:\\s).*(?=from)).*(?<srcHost>(?<=\\s).*?(?=\\s)).*(?<ipSrcAddr>(?<=\\s).*?(?=,)).*(?<dstUserId>(?<=,).*?(?=$))"
+        ]
+      },
+      {
+        "recordType": "su",
+        "regex": [
+          ".*(?<eventInfo>(?<=:\\s).*(?=for)).*(?<dstUserId>(?<=user=).*?(?=to)).*(?<responseCode>(?<=to).*?(?=$))"
+        ]
+      }
+    ]
+    }
+    */
+    @Multiline
+    public static String parserConfigNoMessageHeader;
+    //@formatter:on
+
+    @Test
+    public void testNoMessageHeaderRegex() throws Exception {
+        String message =
+            "<38>Jun 20 15:01:17 deviceName sshd[11672]: Accepted publickey for prod from 22.22.22.22 port 55555 ssh2";
+        JSONObject parserConfig = (JSONObject) new JSONParser().parse(parserConfigNoMessageHeader);
+        regularExpressionsParser.configure(parserConfig);
+        JSONObject parsed = parse(message);
+        // Expected
+
+        Assert.assertEquals(parsed.get("dst_process_name"), "sshd");
+        Assert.assertEquals(parsed.get("dst_process_id"), "11672");
+        Assert.assertEquals(parsed.get("dst_user_id"), "prod");
+        Assert.assertEquals(parsed.get("ip_src_addr"), "22.22.22.22");
+        Assert.assertEquals(parsed.get("ip_src_port"), "55555");
+        Assert.assertEquals(parsed.get("app_protocol"), "ssh2");
+        Assert.assertEquals(parsed.get("original_string"),
+            "<38>Jun 20 15:01:17 deviceName sshd[11672]: Accepted publickey for prod from 22.22.22.22 port 55555 ssh2");
+        Assert.assertTrue(parsed.containsKey("timestamp"));
+
+    }
+
+    //@formatter:off
+    /**
+        {
+            "messageHeaderRegex": "(?<syslog_priority>(?<=^<)\\d{1,4}(?=>)).*?(?<timestampDeviceOriginal>(?<=>)[A-Za-z]{3}\\s{1,2}\\d{1,2}\\s\\d{1,2}:\\d{1,2}:\\d{1,2}(?=\\s)).*?(?<deviceName>(?<=\\s).*?(?=\\s))",
+            "recordTypeRegex": "(?<dstProcessName>(?<=\\s)\\b(tch-replicant|audispd|syslog)\\b(?=\\[|:))",
+            "fields": [
+                {
+                    "recordType": "syslog",
+                    "regex": ".*(?<dstProcessId>(?<=PID\\s=\\s).*?(?=\\sLine)).*"
+                }
+            ]
+        }
+    */
+    @Multiline
+    public static String invalidParserConfig;
+    //@formatter:on
+
+    @Test(expected = IllegalStateException.class)
+    public void testMalformedRegex() throws Exception {
+        String message =
+            "<38>Jun 20 15:01:17 deviceName sshd[11672]: Accepted publickey for prod from 22.22.22.22 port 55555 ssh2";
+        JSONObject parserConfig = (JSONObject) new JSONParser().parse(invalidParserConfig);
+        regularExpressionsParser.configure(parserConfig);
+        parse(message);
+    }
+
+    //@formatter:off
+    /**
+        {
+            "messageHeaderRegex": "(?<syslog_priority>(?<=^<)\\d{1,4}(?=>)).*?(?<timestampDeviceOriginal>(?<=>)[A-Za-z]{3}\\s{1,2}\\d{1,2}\\s\\d{1,2}:\\d{1,2}:\\d{1,2}(?=\\s)).*?(?<deviceName>(?<=\\s).*?(?=\\s))",
+            "fields": [
+                {
+                    "recordType": "syslog",
+                    "regex": ".*(?<dstProcessId>(?<=PID\\s=\\s).*?(?=\\sLine)).*"
+                }
+            ]
+        }
+    */
+    @Multiline
+    public static String noRecordTypeParserConfig;
+    //@formatter:on
+
+    @Test(expected = IllegalStateException.class)
+    public void testNoRecordTypeRegex() throws Exception {
+        String message =
+            "<38>Jun 20 15:01:17 deviceName sshd[11672]: Accepted publickey for prod from 22.22.22.22 port 55555 ssh2";
+        JSONObject parserConfig = (JSONObject) new JSONParser().parse(noRecordTypeParserConfig);
+        regularExpressionsParser.configure(parserConfig);
+        parse(message);
+    }
+
+    private JSONObject parse(String message) throws Exception {
+        List<JSONObject> result = regularExpressionsParser.parse(message.getBytes());
+        if (result.size() > 0) {
+            return result.get(0);
+        }
+        throw new Exception("Could not parse : " + message);
+    }
+}

