hama-dev mailing list archives

From "praveen sripati (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HAMA-561) Hama should support consuming partitioned files
Date Tue, 22 May 2012 04:49:41 GMT

    [ https://issues.apache.org/jira/browse/HAMA-561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280720#comment-13280720 ]

praveen sripati commented on HAMA-561:
--------------------------------------

The partitioned input files can be named in the following format:

<filename>-0
<filename>-1
<filename>-2
<filename>-3
....
....
....
<filename>-n

<filename>-0 is assigned to the bsp task with id 0 for processing, <filename>-1 to the
bsp task with id 1, and so on. BSPJob.set("InputFilesPartitioned", true); can be used to
specify that the input files have already been partitioned. The property defaults to false.
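
The naming convention above can be sketched in plain Java. This is only an illustration of the proposed file-to-task mapping: the class PartitionedInputNaming and the method taskIdFor are hypothetical names, not part of the Hama API, and the sketch assumes the task id is the decimal number after the last '-' in the file name.

```java
// Hypothetical helper, not part of the Hama API: maps a partition file
// named "<filename>-<i>" to the id of the bsp task that should read it.
final class PartitionedInputNaming {

    private PartitionedInputNaming() {}

    /** Returns the task id encoded after the last '-' in the file name. */
    static int taskIdFor(String partitionFileName) {
        int dash = partitionFileName.lastIndexOf('-');
        String suffix = dash < 0 ? "" : partitionFileName.substring(dash + 1);
        // Assumption: the suffix must be a plain run of decimal digits.
        if (!suffix.matches("\\d+")) {
            throw new IllegalArgumentException(
                "Expected <filename>-<index>, got: " + partitionFileName);
        }
        return Integer.parseInt(suffix);
    }
}
```

Under these assumptions, taskIdFor("edges-3") yields 3, so the file would be read by the bsp task with id 3, while a name without a numeric suffix is rejected rather than silently assigned.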


Also, when the input files have been partitioned, the framework has to make sure that the
partitioner class corresponding to the partitioned input files has been specified, so that
the bsp tasks can send messages to the appropriate bsp task. If the specified partitioner
class and the logic used to partition the input files do not match, then the results are
unpredictable.
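
To illustrate why a mismatch is a problem, here is a small sketch in plain Java. Both partitioning rules are assumptions made up for the example (a hash-modulo rule standing in for the logic that wrote the files, and a range rule standing in for a wrongly configured partitioner); neither is a Hama class.

```java
// Both rules below are illustrative assumptions, not Hama classes.
final class PartitionMismatchSketch {

    private PartitionMismatchSketch() {}

    /** Rule assumed to have been used to write the <filename>-i files. */
    static int hashPartition(int key, int numTasks) {
        // Mask the sign bit so the result is never negative.
        return (Integer.hashCode(key) & Integer.MAX_VALUE) % numTasks;
    }

    /** A different rule, standing in for a mismatched partitioner class. */
    static int rangePartition(int key, int maxKeyExclusive, int numTasks) {
        return (int) ((long) key * numTasks / maxKeyExclusive);
    }

    /** True when a message for the key reaches the task that holds it. */
    static boolean routedToOwner(int key, int maxKeyExclusive, int numTasks) {
        return hashPartition(key, numTasks)
            == rangePartition(key, maxKeyExclusive, numTasks);
    }
}
```

With 4 tasks and keys in the range 0 to 7, key 5 lives in the file for task 1 under the hash rule but would be routed to task 2 under the range rule, so a message for that key would arrive at a task that does not hold it.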

Also, if the InputFilesPartitioned parameter is not specified (so it defaults to false) and
a partitioner class is specified, then Hama does the partitioning.
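
The combinations described above can be summarized as a small decision function. This is a sketch of the proposal only, not Hama code: the class, enum, and method names are hypothetical. Two of the four outcomes are assumptions, namely that a job is rejected when the files are flagged as partitioned but no partitioner is given, and that no partitioning happens when neither is set; the comment does not spell these out.

```java
// Hypothetical summary of the proposed behaviour; not part of the Hama API.
final class PartitioningDecision {

    enum Action {
        USE_FILES_AS_IS,    // files already partitioned, partitioner routes messages
        PARTITION_IN_HAMA,  // Hama partitions the input at job submission
        NO_PARTITIONING,    // assumption: neither flag nor partitioner set
        REJECT_JOB          // assumption: partitioned files need a partitioner
    }

    private PartitioningDecision() {}

    static Action decide(boolean inputFilesPartitioned, boolean partitionerSpecified) {
        if (inputFilesPartitioned) {
            return partitionerSpecified ? Action.USE_FILES_AS_IS : Action.REJECT_JOB;
        }
        return partitionerSpecified ? Action.PARTITION_IN_HAMA : Action.NO_PARTITIONING;
    }
}
```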
                
> Hama should support consuming partitioned files
> -----------------------------------------------
>
>                 Key: HAMA-561
>                 URL: https://issues.apache.org/jira/browse/HAMA-561
>             Project: Hama
>          Issue Type: New Feature
>          Components: bsp core
>    Affects Versions: 0.4.0
>            Reporter: praveen sripati
>            Priority: Minor
>
> Currently the input partitioning is done when the job is submitted and the partitioner
> has been specified. There might be a scenario where the input data has already been partitioned
> or there might be a better way of partitioning the input data. So Hama should be made aware
> that the files are already partitioned, and messages should be sent only to the appropriate
> bsp task.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
