hama-dev mailing list archives

From Shuo Wang <ecisp.wangs...@gmail.com>
Subject Re: PageRank Experiment Iteration
Date Wed, 24 Oct 2012 08:55:07 GMT
Do you generate the data yourself? Can you provide the data generator for
me?

2012/10/24 Thomas Jungblut <thomas.jungblut@gmail.com>

> 12 gigs. It uses several times (up to 10x?) more memory than the dataset
> size.
>
> 2012/10/24 Shuo Wang <ecisp.wangshuo@gmail.com>
>
> > How large is your data? Our cluster has 10 nodes and 45 tasks, each task
> > with 512M of memory. But when I run on 200M of data, it fails with an
> > out-of-memory error.
> >
> > 2012/10/24 Thomas Jungblut <thomas.jungblut@gmail.com>
> >
> > > Sure it does run, if you have enough ram ;)
> > >
> > > 2012/10/24 Shuo Wang <ecisp.wangshuo@gmail.com>
> > >
> > > > How much data have you run PageRank on in Hama? Does it run? I want to
> > > > run PageRank on large data in Hama, but it always fails.
> > > >
> > > > 2012/10/24 Thomas Jungblut <thomas.jungblut@gmail.com>
> > > >
> > > > > Yes it works on any directed graph.
> > > > > The best format to use is
> > > > >
> > > > > > Vertex <\t> AdjacentVertex1 <\t> AdjacentVertex2 etc.
> > > > >
> > > > > > So you have an adjacency list, and each line represents a vertex.
> > > > > > This is splittable, which the web-google dataset is not.
> > > > >
> > > > > 2012/10/24 Shuo Wang <ecisp.wangshuo@gmail.com>
> > > > >
> > > > > > > Thanks! Does PageRank work on any web graph? I generated a random
> > > > > > > web graph in the same format as web-Google.txt, but the result is
> > > > > > > infinity.
> > > > > >
> > > > > > 2012/10/24 Thomas Jungblut <thomas.jungblut@gmail.com>
> > > > > >
> > > > > > > > Because graph iterations != supersteps. You have to take the
> > > > > > > > partitioning into account, and the time to accumulate the number
> > > > > > > > of vertices. PageRank requires an additional superstep to run
> > > > > > > > aggregators.
> > > > > > >
> > > > > > > 2012/10/24 Shuo Wang <ecisp.wangshuo@gmail.com>
> > > > > > >
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > > I ran PageRank on Hama with the max iteration count set to 20,
> > > > > > > > > but it ran 48 supersteps. Why?
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
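
The quoted thread recommends an adjacency-list input with one vertex per line instead of a web-Google-style edge list. As a rough sketch of that conversion (not code from the thread), the Python below groups "source<TAB>target" edge lines by source vertex. The function name edges_to_adjacency_lines is hypothetical, the "#" comment handling assumes the web-Google.txt header style, and emitting target-only vertices as a bare vertex ID is an assumption about what the job expects.

```python
from collections import defaultdict


def edges_to_adjacency_lines(edge_lines):
    """Turn 'src<TAB>dst' edge lines into 'vertex<TAB>n1<TAB>n2...' lines.

    Hypothetical helper: one output line per vertex, neighbors tab-separated.
    """
    adjacency = defaultdict(list)
    for line in edge_lines:
        line = line.strip()
        # Skip blank lines and '#' header comments (assumed web-Google style).
        if not line or line.startswith("#"):
            continue
        src, dst = line.split()[:2]
        adjacency[src].append(dst)
        # Assumption: vertices with no outgoing edges still get their own line.
        adjacency.setdefault(dst, [])
    return ["\t".join([vertex] + neighbors)
            for vertex, neighbors in adjacency.items()]


if __name__ == "__main__":
    sample = ["# FromNodeId\tToNodeId", "0\t1", "0\t2", "1\t2"]
    for out_line in edges_to_adjacency_lines(sample):
        print(out_line)
```

Because every vertex ends up on a single line, a line-based input split never cuts a vertex's neighbor list in half, which is presumably what "splittable" refers to above. This version holds the whole graph in memory, so it only suits edge lists that fit on one machine.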
