hadoop-common-dev mailing list archives

From "Jerome Boulon (JIRA)" <j...@apache.org>
Subject [jira] Created: (HADOOP-5087) Regex for Cmd parsing contains an error
Date Tue, 20 Jan 2009 23:57:59 GMT
Regex for Cmd parsing contains an error
---------------------------------------

                 Key: HADOOP-5087
                 URL: https://issues.apache.org/jira/browse/HADOOP-5087
             Project: Hadoop Core
          Issue Type: Bug
          Components: contrib/chukwa
         Environment: HADOOP-4947 uses a regex to parse Chukwa commands, but the regex
contains an error.
The current regex is:
Pattern addCmdPattern = Pattern.compile("[aA][dD][dD]\\s+(\\S+)\\s+(\\S+)\\s+(.*\\S)?\\s*(\\d+)\\s*");
It does not correctly parse this valid checkpoint entry:
"ADD org.apache.hadoop.chukwa.datacollection.adaptor.filetailer.CharFileTailingAdaptorUTF8NewLineEscaped
Syslog 0 /var/log/messages 114027"
Parsing result:
adaptorName org.apache.hadoop.chukwa.datacollection.adaptor.filetailer.CharFileTailingAdaptorUTF8NewLineEscaped
dataType Syslog
params 0 /var/log/messages 11402
offset 7

Instead of:
adaptorName org.apache.hadoop.chukwa.datacollection.adaptor.filetailer.CharFileTailingAdaptorUTF8NewLineEscaped
dataType Syslog
params 0 /var/log/messages 
offset 114027
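
The mis-parse above can be reproduced outside Chukwa. In the current regex, the third group `(.*\\S)?` is greedy and `\\S` also matches digits, so the group consumes everything up to the last digit of the offset and leaves only a single digit for `(\\d+)`. The sketch below is a minimal reproduction, not the Chukwa source: the class name `OriginalAddCmdRegexDemo`, the `parse` helper, and the use of `Matcher.matches()` are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OriginalAddCmdRegexDemo {
    // Regex quoted in this report as the current (faulty) one.
    static final Pattern ORIGINAL = Pattern.compile(
        "[aA][dD][dD]\\s+(\\S+)\\s+(\\S+)\\s+(.*\\S)?\\s*(\\d+)\\s*");

    // Hypothetical helper: returns {adaptorName, dataType, params, offset},
    // or null when the line does not match at all.
    static String[] parse(String line) {
        Matcher m = ORIGINAL.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3), m.group(4) };
    }

    public static void main(String[] args) {
        // The checkpoint entry from the report, written on a single line.
        String line = "ADD org.apache.hadoop.chukwa.datacollection.adaptor.filetailer."
            + "CharFileTailingAdaptorUTF8NewLineEscaped Syslog 0 /var/log/messages 114027";
        String[] r = parse(line);
        // Group 3 swallows all but the last digit of the offset.
        System.out.println("params=[" + r[2] + "]"); // params=[0 /var/log/messages 11402]
        System.out.println("offset=[" + r[3] + "]"); // offset=[7]
    }
}
```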

The correct regex is: "[aA][dD][dD]\\s+(\\S+)\\s+(\\S+)\\s+(.*\\s)?\\s*(\\d+)\\s*"
Example of parsing: "ADD org.apache.hadoop.chukwa.datacollection.adaptor.MySpecificAdaptor
Syslog 0 my param1 param2 /var/log/messages 114027"
Parsing result:
adaptorName org.apache.hadoop.chukwa.datacollection.adaptor.MySpecificAdaptor
dataType Syslog
params 0 my param1 param2 /var/log/messages 
offset 114027
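
The corrected regex can be checked against both entries from this report. Requiring the params group to end in whitespace (`\\s` instead of `\\S`) forces it to stop at the last whitespace before the trailing digits, so the whole number lands in the offset group. As before, this is a sketch under assumptions: the class name `FixedAddCmdRegexDemo` and the `parse` helper are hypothetical, and `Matcher.matches()` is assumed rather than taken from the Chukwa code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FixedAddCmdRegexDemo {
    // Regex proposed in this report as the correction.
    static final Pattern FIXED = Pattern.compile(
        "[aA][dD][dD]\\s+(\\S+)\\s+(\\S+)\\s+(.*\\s)?\\s*(\\d+)\\s*");

    // Hypothetical helper: returns {adaptorName, dataType, params, offset},
    // or null when the line does not match at all.
    static String[] parse(String line) {
        Matcher m = FIXED.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3), m.group(4) };
    }

    public static void main(String[] args) {
        String[] lines = {
            "ADD org.apache.hadoop.chukwa.datacollection.adaptor.filetailer."
                + "CharFileTailingAdaptorUTF8NewLineEscaped Syslog 0 /var/log/messages 114027",
            "ADD org.apache.hadoop.chukwa.datacollection.adaptor.MySpecificAdaptor "
                + "Syslog 0 my param1 param2 /var/log/messages 114027"
        };
        for (String line : lines) {
            String[] r = parse(line);
            // params keeps its trailing space, as in the parsing results shown above.
            System.out.println("adaptorName=" + r[0] + " dataType=" + r[1]
                + " params=[" + r[2] + "] offset=" + r[3]);
        }
    }
}
```

The params group retains its trailing whitespace, which matches the parsing results listed in this report; whether to trim it is left to the caller.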


            Reporter: Jerome Boulon
            Assignee: Jerome Boulon




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

