hama-dev mailing list archives

From "Edward J. Yoon (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (HAMA-949) File splits based on number of input files
Date Thu, 21 May 2015 00:49:59 GMT

     [ https://issues.apache.org/jira/browse/HAMA-949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward J. Yoon resolved HAMA-949.
---------------------------------
    Resolution: Won't Fix

This may not be a good idea. We have to consider data locality even if we want to force
the number of tasks to a fixed value.

I'll handle this on HAMA-956.

> File splits based on number of input files
> ------------------------------------------
>
>                 Key: HAMA-949
>                 URL: https://issues.apache.org/jira/browse/HAMA-949
>             Project: Hama
>          Issue Type: Improvement
>    Affects Versions: 0.6.4
>            Reporter: Edward J. Yoon
>            Assignee: Edward J. Yoon
>             Fix For: 0.7.0
>
>         Attachments: patch.txt
>
>
> I've created multiple input files, considering the max task capacity of the cluster, but it
> wasn't able to run, because file splits are currently determined by the number of blocks.
> I don't know why the code below has been removed. What if we add it again?
> {code}
>     // take the short circuit path if we have already partitioned
>     if (numSplits == files.length) {
>       for (FileStatus file : files) {
>         if (file != null) {
>           splits.add(new FileSplit(file.getPath(), 0, file.getLen(),
>               new String[0]));
>         }
>       }
>       return splits.toArray(new FileSplit[splits.size()]);
>     }
> {code}
> https://www.mail-archive.com/commits@hama.apache.org/msg00319.html
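
For illustration only: the quoted short-circuit path passes new String[0] as the
split's host list, so every per-file split is created without any location hint,
which is one way to read the locality concern in the resolution comment. Below is a
minimal, self-contained sketch of the same one-split-per-file idea that carries host
hints along. InputFile, Split and splitPerFile are hypothetical stand-ins, not Hama
classes, and the sketch assumes each file's hosts are already known. It is not the
change made in HAMA-956.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PerFileSplits {

  /** Hypothetical stand-in for a file's metadata; not a Hama class. */
  static final class InputFile {
    final String path;
    final long length;
    final String[] hosts; // assumed: hosts that store this file's data

    InputFile(String path, long length, String... hosts) {
      this.path = path;
      this.length = length;
      this.hosts = hosts;
    }
  }

  /** Hypothetical stand-in for a split described by path, start, length, hosts. */
  static final class Split {
    final String path;
    final long start;
    final long length;
    final String[] hosts;

    Split(String path, long start, long length, String[] hosts) {
      this.path = path;
      this.start = start;
      this.length = length;
      this.hosts = hosts;
    }
  }

  /**
   * Returns one split per file when numSplits equals the number of files.
   * Returns null otherwise, so a caller can fall back to block-based splitting.
   */
  static List<Split> splitPerFile(List<InputFile> files, int numSplits) {
    if (numSplits != files.size()) {
      return null;
    }
    List<Split> splits = new ArrayList<>();
    for (InputFile file : files) {
      if (file != null) {
        // Keep the host hints instead of passing an empty array.
        splits.add(new Split(file.path, 0, file.length, file.hosts.clone()));
      }
    }
    return splits;
  }

  public static void main(String[] args) {
    List<InputFile> files = Arrays.asList(
        new InputFile("/in/part-0", 128L, "node1", "node2"),
        new InputFile("/in/part-1", 64L, "node3"));
    for (Split s : splitPerFile(files, 2)) {
      System.out.println(s.path + " len=" + s.length
          + " hosts=" + Arrays.toString(s.hosts));
    }
  }
}
```

A file that spans several blocks stored on different hosts has no single best
location, so a whole-file split can still cause remote reads even with hints like
these.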



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
