hama-dev mailing list archives

From "Yuesheng Hu (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HAMA-647) Make the input spliter robustly
Date Thu, 27 Sep 2012 12:11:07 GMT

     [ https://issues.apache.org/jira/browse/HAMA-647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuesheng Hu updated HAMA-647:
-----------------------------

    Attachment: HAMA-647_3.patch

This patch adds a check for the case where the number of input files is larger than the
task count (numSplits): it prints a helpful message and exits the program.

I will attach a unit test later.
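
For readers without the patch at hand, the check described above might look roughly like the sketch below. This is not the code from HAMA-647_3.patch: the class name, method names, message wording, and exit code are all hypothetical, and only the condition itself (more input files than numSplits) comes from the comment.

```java
// Hypothetical sketch; NOT the code from HAMA-647_3.patch.
final class SplitCountCheck {
    private SplitCountCheck() {}

    /** True when there are more input files than requested tasks (numSplits). */
    static boolean hasTooManyFiles(int numInputFiles, int numSplits) {
        return numInputFiles > numSplits;
    }

    /** Builds the message shown to the user; the wording is an assumption. */
    static String tooManyFilesMessage(int numInputFiles, int numSplits) {
        return "Found " + numInputFiles + " input files but only " + numSplits
                + " tasks were requested; increase the number of tasks"
                + " or merge the input files.";
    }

    /** Prints the message and exits, as the comment describes (exit code 1 is assumed). */
    static void checkOrExit(int numInputFiles, int numSplits) {
        if (hasTooManyFiles(numInputFiles, numSplits)) {
            System.err.println(tooManyFilesMessage(numInputFiles, numSplits));
            System.exit(1);
        }
    }
}
```

Keeping the condition and the message in separate methods lets a unit test cover them without triggering System.exit.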
                
> Make the  input spliter robustly
> --------------------------------
>
>                 Key: HAMA-647
>                 URL: https://issues.apache.org/jira/browse/HAMA-647
>             Project: Hama
>          Issue Type: Improvement
>          Components: bsp core
>    Affects Versions: 0.5.0, 0.6.0
>            Reporter: Yuesheng Hu
>            Assignee: Yuesheng Hu
>            Priority: Critical
>             Fix For: 0.6.0
>
>         Attachments: HAMA-647-2.patch, HAMA-647_3.patch, HAMA-647.patch
>
>
> Currently, the splitter in FileInputFormat is based on MapReduce's splitter. But Hama
> differs from MapReduce: a Hama task cannot be left pending until a slot becomes free,
> so the current splitter is not suitable for Hama. With a small input file this may be fine,
> but with a very large input the number of splits becomes very large too, even if the cluster
> is powerful enough to handle the input. For more details, please see the comments.
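
To make the arithmetic behind the description concrete, here is a simplified sketch. It assumes a MapReduce-style splitter that produces about one split per fixed-size block, which is a simplification of the real split-size computation, and it shows how the split size would have to grow to keep the split count within a task limit. The class, the method names, and the sizes used in the note below are illustrative assumptions, not part of Hama or of any attached patch.

```java
// Hypothetical sketch of the sizing arithmetic; not part of Hama.
final class SplitSizing {
    private SplitSizing() {}

    /** Number of splits when each split holds at most splitSize bytes (ceiling division). */
    static long splitCount(long totalBytes, long splitSize) {
        if (totalBytes < 0 || splitSize <= 0) {
            throw new IllegalArgumentException("totalBytes must be >= 0 and splitSize > 0");
        }
        return (totalBytes + splitSize - 1) / splitSize;
    }

    /** Smallest split size (at least 1 byte) that keeps the split count at or below maxTasks. */
    static long splitSizeForTaskLimit(long totalBytes, int maxTasks) {
        if (totalBytes < 0 || maxTasks <= 0) {
            throw new IllegalArgumentException("totalBytes must be >= 0 and maxTasks > 0");
        }
        return Math.max(1L, (totalBytes + maxTasks - 1) / maxTasks);
    }
}
```

Under these assumptions, a 10 GiB input with 64 MiB blocks gives 160 splits. If only 20 tasks can run at once, the split size has to be 512 MiB, because no task can wait for a free slot.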

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
