incubator-hama-dev mailing list archives

From Chia-Hung Lin <cli...@googlemail.com>
Subject Re: Share TODO-list for hama-0.2
Date Fri, 11 Feb 2011 08:37:42 GMT
2011/2/10 Edward J. Yoon <edwardyoon@apache.org>:
> Because the unit tests in TestBSPMaster are mostly commented out at
> this time. We have to uncomment them and fix the bugs.
>

I did not notice the code in TestBSPMaster was commented out. I will fix that.

> P.S. I'm now starting to implement the kill job method and I have a
> question about your code.
>
> To kill the child processes, GroomServer has to know which tasks are
> no longer valid and should be killed. As you know, we removed some data
> structures, e.g., taskIdToGroomNameMap, TaskIdToTaskInProgressMap, ...
> from BSPMaster.
>
> So I'm thinking that GroomServer should 1) check the job's state while
> it is running, and 2) kill and clean up the tasks that are no longer valid.
>
> As before, this would be based on the heartbeat communication between
> BSPMaster and GroomServer. Do you have a better idea?

I agree, because we can check task status in JobInProgress. But we may
also need to add a feature that makes GroomServer periodically report
task status updates back to BSPMaster, because the patch for the new
nexus only reports at the end, when a task has finished.
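
To make the idea concrete, here is a rough sketch of a heartbeat-driven
kill. All names in it (MasterSketch, GroomSketch, heartbeat, beat) are
made up for illustration and are not the actual BSPMaster/GroomServer
API. It assumes the master keeps only a job-id-to-state map and the groom
reports its running tasks on every heartbeat.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical stand-ins only; none of these are real Hama classes.
enum JobState { RUNNING, KILLED }

/** Master side: the only bookkeeping is job id -> job state. */
class MasterSketch {
  private final Map<String, JobState> jobs = new HashMap<>();

  void submit(String jobId) { jobs.put(jobId, JobState.RUNNING); }

  void kill(String jobId) { jobs.put(jobId, JobState.KILLED); }

  /**
   * Heartbeat call: the groom sends (task id -> job id) for its running
   * tasks and gets back the task ids it should kill. Tasks of unknown
   * jobs are treated as invalid too.
   */
  List<String> heartbeat(Map<String, String> runningTasks) {
    List<String> toKill = new ArrayList<>();
    for (Map.Entry<String, String> e : runningTasks.entrySet()) {
      if (jobs.get(e.getValue()) != JobState.RUNNING) {
        toKill.add(e.getKey());
      }
    }
    return toKill;
  }
}

/** Groom side: tracks its own running tasks and cleans up on each beat. */
class GroomSketch {
  // TreeMap keeps iteration order deterministic for the example.
  private final Map<String, String> running = new TreeMap<>();

  void launch(String taskId, String jobId) { running.put(taskId, jobId); }

  /** One heartbeat round: report running tasks, drop the ones to kill. */
  List<String> beat(MasterSketch master) {
    List<String> killed = master.heartbeat(new TreeMap<>(running));
    for (String taskId : killed) {
      // A real groom would destroy the child process here.
      running.remove(taskId);
    }
    return killed;
  }

  Set<String> runningTasks() { return new TreeSet<>(running.keySet()); }
}
```

With this shape the master does not need a task-to-groom map: a killed
job is noticed by each groom on its next heartbeat, and the same periodic
report could carry task status updates.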

>
> On Thu, Feb 10, 2011 at 4:39 PM, Chia-Hung Lin <clin4j@googlemail.com> wrote:
>> 2011/2/9 Edward J. Yoon <edwardyoon@apache.org>:
>>> Hi all,
>>>
>>> Here is our TODO-list:
>>>
>>>  1) Below job commands are not implemented yet
>>>
>>>        [-kill <job-id>]
>>>        [-list-attempt-ids <job-id> <task-state>]
>>>        [-kill-task <task-id>]
>>>        [-fail-task <task-id>]
>>>
>>> 2) TestBSPMaster doesn't work.
>>>
>>> It seems to be a bug introduced by adding the umbilical interface to
>>> GroomServer. Can anyone figure out this problem?
>>
>> Is there any log information available? The unit test on my machine
>> appears to work, as it reports:
>>
>> test:
>>    ...
>>    [junit] Running org.apache.hama.bsp.TestBSPMaster
>>    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 6.248 sec
>>    [junit] Running org.apache.hama.bsp.TestBSPPeer
>>    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 71.882 sec
>>    ...
>> BUILD SUCCESSFUL
>> Total time: 1 minute 51 seconds
>>
>> But when testing with the SerializePrinting example, it throws a
>>
>>    java.io.FileNotFoundException: File does not exist: /tmp/test-example/0
>>
>> exception. It seems that SerializePrinting deletes the test-example dir
>> even if it exists.
>>
>>>
>>> 3) Lack of comments and documentation
>>>
>>> 4) Increase committers
>>>
>>> --
>>> Best Regards, Edward J. Yoon
>>> http://blog.udanax.org
>>> http://twitter.com/eddieyoon
>>>
>>
