hadoop-common-dev mailing list archives

From "Amar Kamat (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-6052) KeyFieldBasedPartitioner would lost data if specifed field not exist
Date Wed, 17 Jun 2009 08:00:12 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-6052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720541#action_12720541 ]

Amar Kamat commented on HADOOP-6052:
------------------------------------

The following tests failed:
||Name||Type||Result||Resolution||
|org.apache.hadoop.mapred.TestReduceFetch|FAILED|Rerun also failed|HADOOP-6029|
|org.apache.hadoop.mapred.TestRunningTaskLimits|FAILED|Rerun passed|?|
|org.apache.hadoop.mapred.TestTaskLimits|FAILED (timeout)|Rerun also failed|HADOOP-5993/HADOOP-6061|


Looking at TestRunningTaskLimits, I see the following code:
{code}

    JobConf jobConf = createWaitJobConf(mr, "job1", 20, 20);
    jobConf.setRunningMapLimit(5);
    jobConf.setRunningReduceLimit(3);
    
    // Submit the job
    RunningJob rJob = (new JobClient(jobConf)).submitJob(jobConf);
    
    // Wait 20 seconds for it to start up
    UtilsForTests.waitFor(20000);
    
    // Check the number of running tasks
    JobTracker jobTracker = mr.getJobTrackerRunner().getJobTracker();
    JobInProgress jip = jobTracker.getJob(rJob.getID());
    assertEquals(5, jip.runningMaps());
    assertEquals(3, jip.runningReduces());
{code}
I don't think waiting a fixed 20 seconds is a good thing to do. When I looked at the logs, only one
reducer had been scheduled.
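
One alternative to the fixed sleep is to poll for the expected state with an upper bound on the wait. The following is only a minimal sketch: the class name PollingWait and the method waitUntil are hypothetical and not part of Hadoop's test utilities, and it assumes Java 8 or later for BooleanSupplier.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of Hadoop: polls a condition instead of sleeping blindly.
final class PollingWait {
    private PollingWait() {}

    /**
     * Checks the condition every pollMs milliseconds until it holds or
     * timeoutMs milliseconds have elapsed. Returns true if the condition
     * was observed to hold, false on timeout or interruption.
     */
    static boolean waitUntil(BooleanSupplier condition, long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

In the test above, a call such as PollingWait.waitUntil(() -> jip.runningMaps() == 5 && jip.runningReduces() == 3, 60000, 500) could stand in for the 20 second wait, with the assertEquals checks kept afterwards so that a timeout still reports which count was wrong. The 60000 and 500 values are placeholders, not tuned numbers.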

Contrib tests passed except for the following:
||Name||Type||Result||Resolution||
|org.apache.hadoop.streaming.TestStreamingExitStatus|FAILED|Known issue|HADOOP-5906|
|org.apache.hadoop.streaming.TestStreamingStderr|FAILED (timeout)|Known issue|HADOOP-6062|
|org.apache.hadoop.mapred.TestCapacitySchedulerConf|FAILED|Second run passed after deleting capacity-scheduler.xml from conf|?|



> KeyFieldBasedPartitioner would lost data if specifed field not exist
> --------------------------------------------------------------------
>
>                 Key: HADOOP-6052
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6052
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Amar Kamat
>            Assignee: Amar Kamat
>             Fix For: 0.21.0
>
>         Attachments: HADOOP-6052-v1.0.patch
>
>
> When using KeyFieldBasedPartitioner, if the record doesn't contain the specified field,
> endChar ends up equal to array.length, which throws an ArrayIndexOutOfBoundsException,
> and that record is lost!
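
The failure mode described in the issue can be illustrated with a small standalone sketch. This is not the code in HADOOP-6052-v1.0.patch and not the actual KeyFieldBasedPartitioner source; the class FieldHashSketch and its methods are hypothetical, and the sketch assumes the key field is hashed over an inclusive byte range whose end index may point one past the last byte when the field is missing.

```java
// Hypothetical illustration only; names and structure are assumptions, not Hadoop code.
final class FieldHashSketch {
    private FieldHashSketch() {}

    /**
     * Hashes b[start..end] (inclusive). Both bounds are clamped into the
     * array, so end == b.length no longer reads past the last byte.
     * A range that is empty after clamping hashes to 0.
     */
    static int hashRange(byte[] b, int start, int end) {
        int from = Math.max(start, 0);
        int to = Math.min(end, b.length - 1);
        int hash = 0;
        for (int i = from; i <= to; i++) {
            hash = 31 * hash + b[i];
        }
        return hash;
    }

    /** Maps a hash to a partition index in [0, numPartitions). */
    static int partitionFor(int hash, int numPartitions) {
        // Mask the sign bit so the modulo result is never negative.
        return (hash & Integer.MAX_VALUE) % numPartitions;
    }
}
```

With the clamp in place, a record whose computed end index equals the array length is hashed over the bytes that do exist and still lands in a valid partition instead of triggering the exception.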

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

