hama-dev mailing list archives

From "Edward J. Yoon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HAMA-963) ArrayIndexOutOfBoundsException occurs when tasks are greater than splits
Date Fri, 07 Aug 2015 02:23:45 GMT

    [ https://issues.apache.org/jira/browse/HAMA-963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661208#comment-14661208 ]

Edward J. Yoon commented on HAMA-963:
-------------------------------------

{code}
I wrote a very small Hama program and ran it on a YARN cluster on my laptop to isolate
the problem:

final public class BSPTest extends BSP<LongWritable, Text, LongWritable, Text, Text>
{

    @Override
    public final void bsp( BSPPeer<LongWritable, Text, LongWritable, Text, Text> peer)
                  throws IOException, InterruptedException, SyncException {
        LongWritable key = new LongWritable();
        Text value = new Text();
        peer.readNext(key,value);
        peer.write(key,value);
    }

    public static void main ( String[] args ) throws Exception {
        HamaConfiguration conf = new HamaConfiguration();
        conf.set("yarn.resourcemanager.address", "localhost:8032");
        YARNBSPJob job = new YARNBSPJob(conf);
        job.setMemoryUsedPerTaskInMb(500);
        job.setNumBspTask(4);
        job.setJobName("test");
        job.setBspClass(BSPTest.class);
        job.setJarByClass(BSPTest.class);
        job.setInputKeyClass(LongWritable.class);
        job.setInputValueClass(Text.class);
        job.setInputPath(new Path("in"));
        job.setInputFormat(TextInputFormat.class);
        job.setPartitioner(org.apache.hama.bsp.HashPartitioner.class);
        job.set("bsp.min.split.size", Long.toString(1000));
        job.setOutputPath(new Path("out"));
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);
        job.setOutputFormat(TextOutputFormat.class);
        job.waitForCompletion(true);
    }
}

where "in" is a small text file stored on HDFS. It does the file partitioning into 4 files
but then it gives me the same error:

15/07/26 06:46:25 INFO ipc.Server: IPC Server handler 0 on 10000, call getTask(attempt_appattempt_1437858941768_0042_000001_0000_000004_4) from 127.0.0.1:54752: error: java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 4
java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 4
    at org.apache.hama.bsp.ApplicationMaster.getTask(ApplicationMaster.java:950)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.hama.ipc.RPC$Server.call(RPC.java:615)
    at org.apache.hama.ipc.Server$Handler$1.run(Server.java:1211)
    at org.apache.hama.ipc.Server$Handler$1.run(Server.java:1207)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)

I get the same error even when I remove the partitioning and I use 1 task.
{code}

> ArrayIndexOutOfBoundsException occurs when tasks are greater than splits
> ------------------------------------------------------------------------
>
>                 Key: HAMA-963
>                 URL: https://issues.apache.org/jira/browse/HAMA-963
>             Project: Hama
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>            Reporter: Edward J. Yoon
>            Priority: Blocker
>             Fix For: 0.7.1
>
>
> ArrayIndexOutOfBoundsException occurs at line 950 of ApplicationMaster when the number
> of tasks is greater than the number of splits:
> {code}
>       assignedSplit = splits[taskid.id];
> {code}
> There are two options:
> Option 1. Launch the additional tasks without an input split.
> Option 2. Adjust the number of tasks to the number of input splits.
> I prefer option 1.
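
The two options in the issue description could be sketched roughly as follows. This is a standalone illustration, not the actual ApplicationMaster code or the committed fix: the class SplitAssignment and the methods splitFor and adjustedTaskCount are hypothetical names, and plain strings stand in for the real split objects.

```java
import java.util.Optional;

/** Hypothetical helper illustrating the two options; not part of Hama. */
public final class SplitAssignment {

    private SplitAssignment() {}

    /**
     * Option 1: look up the split for a task, returning empty instead of
     * throwing when the task id falls outside the splits array, so the
     * extra task can still be launched without input.
     */
    public static <T> Optional<T> splitFor(T[] splits, int taskId) {
        if (splits == null || taskId < 0 || taskId >= splits.length) {
            return Optional.empty();
        }
        return Optional.ofNullable(splits[taskId]);
    }

    /** Option 2: never run more tasks than there are splits. */
    public static int adjustedTaskCount(int requestedTasks, int numSplits) {
        return Math.min(requestedTasks, numSplits);
    }

    public static void main(String[] args) {
        String[] splits = {"split-0", "split-1", "split-2", "split-3"};
        System.out.println(splitFor(splits, 3));                  // Optional[split-3]
        System.out.println(splitFor(splits, 4));                  // Optional.empty
        System.out.println(adjustedTaskCount(5, splits.length));  // 4
    }
}
```

With option 1 the caller has to handle the empty case explicitly (a task with nothing to read), whereas option 2 silently changes the task count the user asked for.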



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
