hama-dev mailing list archives

From 顾荣 <gurongwal...@gmail.com>
Subject Re: Some questions about execution workflow of HamaGraph.
Date Wed, 19 Sep 2012 15:05:57 GMT
Hi Thomas,

I just read this part of the code in the submitJobInternal() function of
org.apache.hama.bsp.BSPJobClient. As you mentioned, raw BSPs have the
opportunity to partition before the job:

      // Create the splits for the job
      LOG.debug("Creating splits at " + fs.makeQualified(submitSplitFile));
      if (job.getConf().get("bsp.input.partitioner.class") != null
          && !job.getConf()
              .getBoolean("hama.graph.runtime.partitioning", false)) {
        job = partition(job, maxTasks);
        maxTasks = job.getInt("hama.partition.count", maxTasks);
      }
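
To make the condition above concrete, here is a minimal standalone sketch of when the pre-job partitioning branch would be taken. It uses a plain Map as a stand-in for the job configuration, so the class name PrePartitionCheck, the method partitionsBeforeJob, and the partitioner class name example.MyPartitioner are all hypothetical. Only the two property keys come from the quoted code, and the boolean parsing is a simplification.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in, not Hama code: evaluates the same condition as the
// snippet quoted from submitJobInternal(), using a Map instead of the job conf.
final class PrePartitionCheck {

  static boolean partitionsBeforeJob(Map<String, String> conf) {
    // A partitioner class must be configured...
    boolean hasPartitioner = conf.get("bsp.input.partitioner.class") != null;
    // ...and runtime partitioning must not be enabled (missing key means false).
    boolean runtimePartitioning = Boolean.parseBoolean(
        conf.getOrDefault("hama.graph.runtime.partitioning", "false"));
    return hasPartitioner && !runtimePartitioning;
  }

  public static void main(String[] args) {
    Map<String, String> rawBsp = new HashMap<>();
    rawBsp.put("bsp.input.partitioner.class", "example.MyPartitioner"); // hypothetical class

    Map<String, String> runtimePartitioned = new HashMap<>(rawBsp);
    runtimePartitioned.put("hama.graph.runtime.partitioning", "true");

    System.out.println("raw BSP with partitioner: " + partitionsBeforeJob(rawBsp));
    System.out.println("runtime partitioning on:  " + partitionsBeforeJob(runtimePartitioned));
    System.out.println("no partitioner set:       " + partitionsBeforeJob(new HashMap<>()));
  }
}
```

If I read the condition correctly, a job that enables runtime partitioning skips this branch entirely, even when a partitioner class is configured.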

By the way, if I do not part

2012/9/19 Thomas Jungblut <thomas.jungblut@gmail.com>

> Hey,
>
> the file is split the same way Hadoop does it, as defined by the input format.
> It will be partitioned at runtime. Raw BSPs have the opportunity to
> partition before the job, but this is not very scalable, so we have not done
> it for graph algorithms. There is no load balancing besides the usual hash
> partitioning. However, you can write your own partitioner to distribute the
> vertices. We are going to provide work stealing in the future so that load
> balancing gets better.
>
>
> 2012/9/19 Yuesheng Hu <yueshenghu@gmail.com>
>
> > org.apache.hama.graph.GraphJobRunner is the most important class you
> > should read, along with the other classes in org.apache.hama.graph.
> >
> >
> > 2012/9/19 顾荣 <gurongwalker@gmail.com>
> >
> > > Hi all, I have some questions about your design in HamaGraph. Let me take
> > > the PageRank example to illustrate my questions.
> > >
> > > I have 3 Groom Servers, each with 3 free BSP task nodes, in my Hama
> > > cluster. The input file is as below.
> > >
> > > "stackoverflow.com    yahoo.com
> > > facebook.com    twitter.com    google.com    nasa.gov
> > > yahoo.com    nasa.gov    stackoverflow.com
> > > twitter.com    google.com    facebook.com
> > > nasa.gov    yahoo.com    stackoverflow.com
> > > youtube.com    google.com    yahoo.com
> > > "
> > > In this case, there are 6 vertices. How do you assign them among these task
> > > nodes? Can it guarantee load balancing? And do you support a way for users
> > > to supply their own vertex assignment policy? I am confused by the
> > > task-splitting part of Hama: from its source code it seems the same as
> > > Hadoop (by input splits), but it works differently. And is the task-splitting
> > > part of HamaBSP the same as in HamaGraph?
> > >
> > > Would you please give some info about that? If you are too busy to answer my
> > > questions, please kindly point me to the classes or functions in the source
> > > code that implement what I am confused about, and I will read more myself.
> > >
> > > Anyway, thanks again.
> > >
> > > Walker
> > >
> >
>
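
As a rough illustration of the "usual hash partitioning" mentioned in the quoted reply, the sketch below assigns the six vertices from the example input to tasks by hashing the vertex ID. This is not Hama's actual code: the class HashPartitionSketch and the methods partitionFor and assign are hypothetical names, and the task count of 3 is an assumption, since the real number of tasks depends on the input splits.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of hash partitioning, not taken from the Hama source.
final class HashPartitionSketch {

  // Map a vertex ID to a task index in [0, numTasks).
  // floorMod keeps the result non-negative even for negative hash codes.
  static int partitionFor(String vertexId, int numTasks) {
    return Math.floorMod(vertexId.hashCode(), numTasks);
  }

  // Group vertex IDs by the task index they hash to.
  static Map<Integer, List<String>> assign(List<String> vertexIds, int numTasks) {
    Map<Integer, List<String>> byTask = new TreeMap<>();
    for (int t = 0; t < numTasks; t++) {
      byTask.put(t, new ArrayList<>());
    }
    for (String v : vertexIds) {
      byTask.get(partitionFor(v, numTasks)).add(v);
    }
    return byTask;
  }

  public static void main(String[] args) {
    // The six vertices from the first column of the example input.
    List<String> vertices = List.of(
        "stackoverflow.com", "facebook.com", "yahoo.com",
        "twitter.com", "nasa.gov", "youtube.com");
    int numTasks = 3; // assumed task count for illustration
    assign(vertices, numTasks).forEach(
        (task, vs) -> System.out.println("task " + task + " -> " + vs));
  }
}
```

Because the assignment depends only on the hash of the ID, the same vertex always lands on the same task, but nothing guarantees that the tasks receive equal numbers of vertices, which matches the remark that there is no load balancing beyond hashing.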
