hama-dev mailing list archives

From "Edward J. Yoon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HAMA-781) Setting partition split fails in local mode when file size is big and has a runtime partition (HashParitioner)
Date Thu, 25 Jul 2013 07:53:47 GMT

    [ https://issues.apache.org/jira/browse/HAMA-781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13719353#comment-13719353 ]

Edward J. Yoon commented on HAMA-781:
-------------------------------------

{quote}
why not use counter and do partitioning?
{quote}

Each partition file should have its own specific partition ID, so I used the file name to
identify the partition ID. See PartitioningRunner.java:

{code}

        for (int i = 0; i < files.length; i++) {
          LOG.debug("merge '" + files[i].getPath() + "' into " + partitionDir
              + "/" + getPartitionName(partitionID));


....

  private static String getPartitionName(int i) {
    return "part-" + String.valueOf(100000 + i).substring(1, 6);
  }
{code}


Your patch will work ("for (InputSplit split : splits)" lists the files in alphabetical
order), but an incremental counter is not safe, because there is no way to check whether each
file's partition ID equals the counter.
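
To illustrate the concern, here is a minimal sketch (not Hama code). It reuses the getPartitionName scheme from the snippet above; parsePartitionId is a hypothetical helper, and the file listing in main is made up to show what happens if one partition file is missing. I assume the files are named exactly "part-" plus five digits.

```java
final class PartitionNames {
  // Same naming scheme as the quoted getPartitionName; only distinct for IDs 0..99999.
  static String getPartitionName(int i) {
    return "part-" + String.valueOf(100000 + i).substring(1, 6);
  }

  // Hypothetical helper: recover the partition ID from a file name
  // instead of trusting the position of the file in the listing.
  static int parsePartitionId(String fileName) {
    if (!fileName.matches("part-\\d{5}")) {
      throw new IllegalArgumentException("Not a partition file: " + fileName);
    }
    return Integer.parseInt(fileName.substring("part-".length()));
  }

  public static void main(String[] args) {
    // Made-up listing in alphabetical order, with partition 1 missing.
    String[] names = {getPartitionName(0), getPartitionName(2)};
    int counter = 0;
    for (String name : names) {
      int id = parsePartitionId(name);
      // The counter and the ID taken from the name diverge after the gap.
      System.out.println(name + ": counter=" + counter + ", id from name=" + id);
      counter++;
    }
  }
}
```

With a gap in the listing, the counter silently assigns the wrong ID to every later file, while the ID parsed from the name stays correct.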

What are the partition files named in local mode? We need to find a safer way.
                
> Setting partition split fails in local mode when file size is big and has a runtime partition (HashParitioner)
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: HAMA-781
>                 URL: https://issues.apache.org/jira/browse/HAMA-781
>             Project: Hama
>          Issue Type: Bug
>          Components: bsp core
>            Reporter: Ikhtiyor Ahmedov
>            Priority: Minor
>         Attachments: HAMA-781.patch
>
>
> When the input partitioner is set to HashPartitioner and the file size is big in local mode,
> line 566 of BSPJobClient.java throws an index out of bounds exception.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
