hama-dev mailing list archives

From Suraj Menon <surajsme...@apache.org>
Subject Re: Partitioner in Hama
Date Sun, 06 Jan 2013 20:28:17 GMT
>    1. I am referring to org.apache.hama.bsp.PartitioningRunner, it's named
>    as so in the HEAD (1429573) of trunk. It isn't removed but it isn't
>    referred to anywhere else. I can't find any references to it in the
>    workspace.
>

It is referenced in BSPJob#waitForCompletion, which runs it as a separate
BSP job to create the specified splits.


>    2. job.setPartitioner is the same as setting
>    "bsp.input.partitioner.class". Anyway, as I see it, partitions are not
>    being created, which causes the following.
>    If I am running the task on the local fs and not hdfs, there's just one
>    input split, and even if I set a partitioner to create two partitions and
>    set bsp.setNumTasks(2), this is overridden and only one task is executed.
>    See BSPJobClient#submitJobInternal(), which does the following:
>    job.setNumBspTask(writeSplits(job, submitSplitFile, maxTasks)); Line
>    326.
>
This partitioning job is set to run if the number of splits != the number
of tasks, or if it is forced by the configuration. I can share the current
state of my HAMA-700 patch with you.
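
As a rough illustration of that condition only (this is not the Hama source;
the class name, method name, and the "forced" flag are hypothetical stand-ins
for whatever BSPJob#waitForCompletion and the configuration actually use), the
check amounts to something like:

```java
// Hypothetical sketch of the decision described above; not taken from Hama.
final class PartitioningDecision {

    /**
     * Returns true when a separate partitioning job should run first.
     *
     * @param numSplits number of input splits found for the job
     * @param numTasks  number of BSP tasks the user asked for
     * @param forced    assumed flag meaning "partition regardless", set via configuration
     */
    public static boolean shouldRunPartitioner(int numSplits, int numTasks, boolean forced) {
        return forced || numSplits != numTasks;
    }

    public static void main(String[] args) {
        // Local fs case from this thread: one input split, two requested tasks.
        System.out.println(shouldRunPartitioner(1, 2, false));
        // Splits already match the requested task count and nothing is forced.
        System.out.println(shouldRunPartitioner(2, 2, false));
    }
}
```

Under this reading, the local fs case in your point 2 (one split, two tasks)
would be expected to trigger the partitioning job.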


>    3. So here is what I think is happening: the Partitioner is not in the
>    codepath (try putting a breakpoint inside the partitioner and executing
>    a non-graph BSP task), so partitions are not being created and
>    writeSplits() is returning 1.
>    [ writeSplits() returns the number of splits in the input. ]
>

Probably because it is running as a separate process?
