hadoop-yarn-issues mailing list archives

From "Junping Du (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-5470) Differentiate exactly match with regex in yarn log CLI
Date Thu, 04 Aug 2016 16:41:20 GMT

    [ https://issues.apache.org/jira/browse/YARN-5470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15408087#comment-15408087 ]

Junping Du commented on YARN-5470:
----------------------------------

Thanks Xuan for contributing the patch. 

{noformat}
+        + "to get exact matched log files. Use \"ALL\" or \".*\"to "
+        + "fetch all the log files for the container. Specific -regex "
+        + "for using java regex to find matched log files.");
{noformat}
I think ".*" matches everything only when it is interpreted as a regular expression, doesn't it?
In the normal (non-regex) case, we use "*" directly as the wildcard. So we could offer several
ways to fetch all log files: "-logFiles ALL", "-logFiles *" and "-logFiles -regex .*".
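
To illustrate why the distinction matters, here is a minimal standalone sketch. The class and
method names are hypothetical and not part of the patch, and it assumes a regex pattern must
match the whole file name, which may differ from what isFileMatching actually does.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class LogFileMatchDemo {

  // Exact match: a requested entry must equal the file name literally.
  static boolean matchesExact(String fileName, List<String> requested) {
    return requested.contains(fileName);
  }

  // Regex match: each entry is treated as a Java regex that must match
  // the whole file name (an assumption made for this sketch).
  static boolean matchesRegex(String fileName, List<String> patterns) {
    for (String pattern : patterns) {
      if (Pattern.matches(pattern, fileName)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    List<String> requested = Arrays.asList("system.out");
    System.out.println(matchesExact("system.out", requested)); // true
    System.out.println(matchesExact("systemXout", requested)); // false
    // As a regex, '.' matches any character, so "system.out" also matches "systemXout".
    System.out.println(matchesRegex("systemXout", requested)); // true
    System.out.println(matchesRegex("stderr", Arrays.asList(".*"))); // true
  }
}
```

Note that a bare "*" is not a valid Java regex (Pattern.compile throws a PatternSyntaxException),
so "*" only makes sense as a wildcard outside of regex mode.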

{noformat}
   private List<String> getMatchedLogFiles(ContainerLogsRequest options,
-      Collection<String> candidate) throws IOException {
+      Collection<String> candidate, boolean useRegex) throws IOException {
     List<String> matchedFiles = new ArrayList<String>();
     List<String> filePattern = options.getLogTypes();
+    boolean fetchAll = fetchAllLogFiles(
+        filePattern.toArray(new String[filePattern.size()]));
     for (String file : candidate) {
-      if (isFileMatching(file, filePattern)) {
+      if (fetchAll) {
         matchedFiles.add(file);
       }
+      if (useRegex) {
+        if (isFileMatching(file, filePattern)) {
+          matchedFiles.add(file);
+        }
+      } else {
+        if (filePattern.contains(file)) {
+          matchedFiles.add(file);
+        }
+      }
     }
     return matchedFiles;
   }
{noformat}
This could add duplicate log files if the user specifies ALL together with some specific log
file. Why don't we return a Set instead of a List of log files? Do we need to fetch items in
order or by index? If not, replacing the List with a Set gets rid of the duplication issue and
also allows a fast path, such as returning candidate directly when fetchAll = true.
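
A rough sketch of the Set-based variant, written as a standalone class so it can be run on its
own. The names MatchedLogFilesSketch, fetchAll and matchesAnyRegex are hypothetical stand-ins
for the patch's helpers, and treating "ALL" and "*" as the fetch-all tokens is an assumption
taken from the suggestion above, not from the patch itself.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

public class MatchedLogFilesSketch {

  // Hypothetical stand-in for fetchAllLogFiles: "ALL" or "*" means everything.
  static boolean fetchAll(List<String> filePattern) {
    return filePattern.contains("ALL") || filePattern.contains("*");
  }

  // Hypothetical stand-in for isFileMatching: whole-name regex match.
  static boolean matchesAnyRegex(String file, List<String> filePattern) {
    for (String pattern : filePattern) {
      if (Pattern.matches(pattern, file)) {
        return true;
      }
    }
    return false;
  }

  static Set<String> getMatchedLogFiles(List<String> filePattern,
      Collection<String> candidate, boolean useRegex) {
    // Fast path: nothing to filter when all files are requested.
    if (fetchAll(filePattern)) {
      return new LinkedHashSet<String>(candidate);
    }
    // A Set drops duplicates; LinkedHashSet also keeps candidate order.
    Set<String> matchedFiles = new LinkedHashSet<String>();
    for (String file : candidate) {
      boolean matched = useRegex
          ? matchesAnyRegex(file, filePattern)
          : filePattern.contains(file);
      if (matched) {
        matchedFiles.add(file);
      }
    }
    return matchedFiles;
  }

  public static void main(String[] args) {
    List<String> candidate = Arrays.asList("stdout", "stderr", "syslog");
    // "ALL" plus a specific file no longer yields a duplicate entry.
    System.out.println(
        getMatchedLogFiles(Arrays.asList("ALL", "stderr"), candidate, false));
    System.out.println(
        getMatchedLogFiles(Arrays.asList("std.*"), candidate, true));
  }
}
```

Using LinkedHashSet keeps the candidate order in case callers do rely on it; a plain HashSet
would do if ordering turns out not to matter.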

> Differentiate exactly match with regex in yarn log CLI
> ------------------------------------------------------
>
>                 Key: YARN-5470
>                 URL: https://issues.apache.org/jira/browse/YARN-5470
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Xuan Gong
>            Assignee: Xuan Gong
>         Attachments: YARN-5470.1.patch, YARN-5470.2.patch
>
>
> Since YARN-5089, we support regular expressions in the YARN log CLI "-logFiles" option. However,
> we should differentiate exact match from regex match, as a user could put something like
> "system.out" here, which has different semantics under each.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

