hadoop-hdfs-user mailing list archives

From jamal sasha <jamalsha...@gmail.com>
Subject Dealing with stragglers in hadoop
Date Fri, 15 Nov 2013 08:44:23 GMT
Hi,
I have a very simple use case.
I have an edge list and I am trying to convert it into an adjacency list.
For example:

src  target
a    b
a    c
b    d
b    e

and so on.
What I am trying to build is:

a [b,c]
b [d,e]
and so on.
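
To make the transformation above concrete, here is a minimal sketch in plain Python (not actual Hadoop code). It assumes each input line holds a whitespace-separated `src target` pair; the function names `map_edge` and `build_adjacency` are hypothetical and only mirror the map and reduce steps of keying on the source node.

```python
from collections import defaultdict


def map_edge(line):
    """Map step: split one edge line into a (src, target) pair."""
    src, target = line.split()
    return src, target


def build_adjacency(lines):
    """Reduce step: collect every target under its source node."""
    adjacency = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        src, target = map_edge(line)
        adjacency[src].append(target)
    return dict(adjacency)


edges = ["a     b", "a    c", "b    d", "b    e"]
adjacency = build_adjacency(edges)
for node, targets in adjacency.items():
    print(node, targets)
```

In a real job, the grouping done here by the dictionary would be done by the shuffle, with one reducer call per source node.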

But every now and then I hit a super node, which has millions of edges.

Keying on just the node ID therefore results in poor MR execution, because
the reducer that receives the super node becomes a straggler.
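
One common way to spread such a hot key, which is an assumption here and not something this job already does, is to "salt" the key: append a small bucket number derived from the target, so the edges of one super node land on several reducers, and then merge the partial lists in a second pass. The sketch below simulates this in plain Python. `NUM_SALTS`, `salted_key`, `first_pass`, and `second_pass` are hypothetical names, and the bucket count of 4 is arbitrary.

```python
import zlib
from collections import defaultdict

NUM_SALTS = 4  # hypothetical number of buckets per source node


def salted_key(src, target, num_salts=NUM_SALTS):
    """Build a composite key (src, salt); the salt is a stable hash of the target."""
    salt = zlib.crc32(target.encode("utf-8")) % num_salts
    return (src, salt)


def first_pass(edges, num_salts=NUM_SALTS):
    """First job: group targets by the salted key, giving partial adjacency lists."""
    partial = defaultdict(list)
    for src, target in edges:
        partial[salted_key(src, target, num_salts)].append(target)
    return partial


def second_pass(partial):
    """Second job: drop the salt and concatenate the partial lists per source node."""
    merged = defaultdict(list)
    for (src, _salt), targets in partial.items():
        merged[src].extend(targets)
    return {src: sorted(targets) for src, targets in merged.items()}


edges = [("a", "b"), ("a", "c"), ("b", "d"), ("b", "e")]
edges += [("hub", "n%d" % i) for i in range(1000)]  # simulated super node

partial = first_pass(edges)
result = second_pass(partial)
hub_buckets = [key for key in partial if key[0] == "hub"]
print(len(hub_buckets), "buckets hold the edges of 'hub'")
```

Note that the second pass still brings all of a super node's targets together under one key, so this mainly helps when the first pass does the expensive work, or when downstream steps can consume the partial lists directly.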

I have been trying to understand the partitioner, but I am at a loss as to
how to use it here.

How do I solve this straggler issue?
Thanks
