hadoop-common-dev mailing list archives

From "Runping Qi (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3162) Map/reduce stops working with comma separated input paths
Date Tue, 15 Apr 2008 04:05:05 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12588891#action_12588891 ]

Runping Qi commented on HADOOP-3162:

I assume the two new methods you refer to are meant to be:
public static void setInputPaths(JobConf job,   String commaSeparatedFilePaths);
public static void addInputPaths(JobConf job,   String commaSeparatedFilePaths);
They don't break backward compatibility.
The patch implemented them incorrectly.
The correct implementation should look like:
public static void setInputPaths(JobConf job,   String commaSeparatedFilePaths) {
    // Treat each comma in commaSeparatedFilePaths that is not enclosed by '{' and '}' as
    // a separator, and split commaSeparatedFilePaths into a string array on those separators.
    // Let Path[] paths be the array of paths created from those strings.
    setInputPaths(job, paths);
}
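
The splitting step described in the comments above could be sketched as follows. This is only an illustration of the idea, not the code from the patch or from Hadoop itself: the class name CommaPathSplitter and the method split are hypothetical, and the sketch returns plain strings rather than Path objects so that it compiles without Hadoop on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not the Hadoop implementation: splits a
// comma-separated path list while leaving commas inside {...} globs alone.
final class CommaPathSplitter {

    static List<String> split(String commaSeparatedPaths) {
        List<String> paths = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int braceDepth = 0; // > 0 while inside a '{' ... '}' glob group

        for (char c : commaSeparatedPaths.toCharArray()) {
            if (c == '{') {
                braceDepth++;
            } else if (c == '}' && braceDepth > 0) {
                braceDepth--;
            }
            if (c == ',' && braceDepth == 0) {
                // Top-level comma: this ends one path.
                paths.add(current.toString());
                current.setLength(0);
            } else {
                // Everything else, including commas inside braces, is kept.
                current.append(c);
            }
        }
        paths.add(current.toString()); // the last (or only) path
        return paths;
    }
}
```

Each returned string would then be wrapped in a Path and handed to the existing Path-based setter. A comma inside a glob such as /logs/{2008,2009}/a stays part of that one path.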

When you replace code that uses the existing API with code that uses the new API, as in:
-      grepJob.setInputPath(new Path(args[0]));
+      FileInputFormat.setInputPaths(grepJob, new Path(args[0]));
That is incorrect: passing new Path(args[0]) treats the whole comma-separated string as a single path. The correct replacement is:
-      grepJob.setInputPath(new Path(args[0]));
+      FileInputFormat.setInputPaths(grepJob, args[0]);

> Map/reduce stops working with comma separated input paths
> ---------------------------------------------------------
>                 Key: HADOOP-3162
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3162
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.17.0
>            Reporter: Runping Qi
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>         Attachments: patch-3162.txt, patch-3162.txt, patch-3162.txt, patch-3162.txt,
> When a job is given a comma-separated list of input files, the FileInputFormat class throws an exception complaining that the input is invalid:
> org.apache.hadoop.mapred.InvalidInputException: Input path doesnt exist : hdfs:/namenode:port/gridmix/data/MonsterQueryBlockCompressed/part-00000,/gridmix/data/MonsterQueryBlockCompressed/part-00001,/gridmix/data/MonsterQueryBlockCompressed/part-00002
>         at org.apache.hadoop.mapred.FileInputFormat.validateInput(FileInputFormat.java:213)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:705)
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:973)
>         at org.apache.hadoop.mapred.GenericMRLoadGenerator.run(GenericMRLoadGenerator.java:189)

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
