hadoop-common-dev mailing list archives

From "Runping Qi (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3162) Map/reduce stops working with comma separated input paths
Date Tue, 15 Apr 2008 04:05:05 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12588891#action_12588891 ]

Runping Qi commented on HADOOP-3162:
------------------------------------


I assume the two new methods you refer to are meant to be:
{code}
public static void setInputPaths(JobConf job, String commaSeparatedFilePaths);
public static void addInputPaths(JobConf job, String commaSeparatedFilePaths);
{code}
They don't break backward compatibility.
The patch implemented them incorrectly.
The correct implementation should look like:
{code}
public static void setInputPaths(JobConf job, String commaSeparatedFilePaths) {
    // Treat each comma in commaSeparatedFilePaths that is not enclosed by '{' and '}' as a separator.
    // Split commaSeparatedFilePaths into an array of strings at those separators.
    // Let Path[] paths be the array of paths created from those strings.
    setInputPaths(job, paths);
}
{code}
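
The splitting step described in the comments above could be sketched roughly as follows. This is only an illustration, not the code from the patch: the class name PathSplitter and the method splitPaths are hypothetical, and the sketch assumes that braces are only used for glob groups such as {a,b} and that nothing is escaped.

```java
import java.util.ArrayList;
import java.util.List;

public class PathSplitter {
    // Hypothetical helper (not part of Hadoop or the patch): split the input
    // on commas that are not enclosed by '{' and '}'.
    public static String[] splitPaths(String commaSeparatedPaths) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0; // number of '{' groups currently open

        for (int i = 0; i < commaSeparatedPaths.length(); i++) {
            char c = commaSeparatedPaths.charAt(i);
            if (c == '{') {
                depth++;
            } else if (c == '}' && depth > 0) {
                depth--;
            }
            if (c == ',' && depth == 0) {
                // A top-level comma ends the current path string.
                parts.add(current.toString());
                current.setLength(0);
            } else {
                // Everything else, including commas inside braces, is kept.
                current.append(c);
            }
        }
        parts.add(current.toString());
        return parts.toArray(new String[0]);
    }
}
```

Each resulting string would then be turned into a Path and the whole array handed to setInputPaths(job, paths), so a glob such as /data/part-{0,1} stays in one piece while the top-level commas separate the inputs.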


When you replace code that uses the existing API with code that uses the new API, like:
{code}
-      grepJob.setInputPath(new Path(args[0]));
+      FileInputFormat.setInputPaths(grepJob, new Path(args[0]));
{code}
That is incorrect. The correct one should be:
{code}
-      grepJob.setInputPath(new Path(args[0]));
+      FileInputFormat.setInputPaths(grepJob, args[0]);
{code}
 

> Map/reduce stops working with comma separated input paths
> ---------------------------------------------------------
>
>                 Key: HADOOP-3162
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3162
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.17.0
>            Reporter: Runping Qi
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3162.txt, patch-3162.txt, patch-3162.txt, patch-3162.txt, patch-3162.txt
>
>
> When a job is given a comma separated input file list, FileInputFormat class throws an exception, complaining the input is invalid:
> org.apache.hadoop.mapred.InvalidInputException: Input path doesnt exist : hdfs:/namenode:port/gridmix/data/MonsterQueryBlockCompressed/part-00000,/gridmix/data/MonsterQueryBlockCompressed/part-00001,/gridmix/data/MonsterQueryBlockCompressed/part-00002
>         at org.apache.hadoop.mapred.FileInputFormat.validateInput(FileInputFormat.java:213)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:705)
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:973)
>         at org.apache.hadoop.mapred.GenericMRLoadGenerator.run(GenericMRLoadGenerator.java:189)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

