Subject: Re: Too few memory segments provided exception
From: Shivani Ghatge
To: user@flink.apache.org
Date: Mon, 20 Jul 2015 15:43:25 +0200

Hello Vasia,

As I had mentioned before, I need a BloomFilter as well as a HashSet for
the approximation to work. In the exact solution I am getting two HashSets
and comparing them. In the approximate version, if we get two BloomFilters
then we have no way to compare the neighborhood sets.

I thought we agreed that the BloomFilters are to be sent as messages to
the vertices?

The exact version is passing all the tests.

On removing the final GroupReduce the program works, but I need it to add
the partial Adamic-Adar edge weights.
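For illustration, here is a minimal standalone sketch of the comparison I
mean, using Guava's BloomFilter (the class and method names below are only
illustrative, not the code in my PR). The exact version intersects two
HashSets; the approximate version probes one vertex's HashSet against the
BloomFilter received from the other vertex, which is why two BloomFilters
alone are not enough:

    import com.google.common.hash.BloomFilter;
    import com.google.common.hash.Funnels;
    import java.util.HashSet;
    import java.util.Map;

    public class NeighborhoodComparison {

        // Exact Adamic-Adar contribution: intersect the two neighbor sets and
        // add 1 / log(degree(z)) for every common neighbor z.
        static double exactWeight(HashSet<Long> neighborsU, HashSet<Long> neighborsV,
                                  Map<Long, Long> degrees) {
            double weight = 0.0;
            for (Long z : neighborsU) {
                if (neighborsV.contains(z)) {
                    weight += 1.0 / Math.log(degrees.get(z));
                }
            }
            return weight;
        }

        // Approximate contribution: probe the HashSet of one endpoint against the
        // BloomFilter of the other. False positives can over-count, but the
        // neighborhood representation is much more compact.
        static double approximateWeight(HashSet<Long> neighborsU, BloomFilter<Long> bloomOfV,
                                        Map<Long, Long> degrees) {
            double weight = 0.0;
            for (Long z : neighborsU) {
                if (bloomOfV.mightContain(z)) {
                    weight += 1.0 / Math.log(degrees.get(z));
                }
            }
            return weight;
        }

        public static void main(String[] args) {
            HashSet<Long> neighborsV = new HashSet<Long>();
            BloomFilter<Long> bloomOfV = BloomFilter.create(Funnels.longFunnel(), 1000, 0.01);
            for (long z = 1; z <= 5; z++) {
                neighborsV.add(z);
                bloomOfV.put(z);
            }
            // neighborsU and the degree map would be built from the graph in the
            // same way before calling exactWeight / approximateWeight.
        }
    }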
On Mon, Jul 20, 2015 at 3:15 PM, Vasiliki Kalavri
<vasilikikalavri@gmail.com> wrote:

> Hi Shivani,
>
> why are you using a vertex-centric iteration to compute the approximate
> Adamic-Adar?
> It's not an iterative computation :)
>
> In fact, it should be as complex (in terms of operators) as the exact
> Adamic-Adar, only more efficient because of the different neighborhood
> representation. Are you having the same problem with the exact
> computation?
>
> Cheers,
> Vasia.
>
> On 20 July 2015 at 14:41, Maximilian Michels <mxm@apache.org> wrote:
>
>> Hi Shivani,
>>
>> The issue is that by the time the hash join is executed, the
>> MutableHashTable cannot allocate enough memory segments. That means
>> that your other operators are occupying them. It is expected that this
>> also occurs on Travis because the workers there have limited memory as
>> well.
>>
>> Till suggested changing the memory fraction through the
>> ExecutionEnvironment. Can you try that?
>>
>> Cheers,
>> Max
>>
>> On Mon, Jul 20, 2015 at 2:23 PM, Shivani Ghatge <shghatge@gmail.com>
>> wrote:
>>
>>> Hello Maximilian,
>>>
>>> Thanks for the suggestion. I will use it to check the program. But
>>> when I create a PR for the same implementation with a test, I get the
>>> same error even on the Travis build. What would be the solution for
>>> that?
>>>
>>> Here is my PR: https://github.com/apache/flink/pull/923
>>> And here is the Travis build status:
>>> https://travis-ci.org/apache/flink/builds/71695078
>>>
>>> Also, in the IDE it works fine in collection execution mode.
>>>
>>> Thanks and Regards,
>>> Shivani
>>>
>>> On Mon, Jul 20, 2015 at 2:14 PM, Maximilian Michels <mxm@apache.org>
>>> wrote:
>>>
>>>> Hi Shivani,
>>>>
>>>> Flink doesn't have enough memory to perform a hash join. You need to
>>>> provide Flink with more memory. You can either increase the
>>>> "taskmanager.heap.mb" config variable or set
>>>> "taskmanager.memory.fraction" to some value greater than 0.7 and
>>>> smaller than 1.0. The first config variable allocates more overall
>>>> memory for Flink; the latter changes the ratio between Flink-managed
>>>> memory (e.g. for the hash join) and user memory (for your functions
>>>> and Gelly's code).
>>>>
>>>> If you run this inside an IDE, the memory is configured automatically
>>>> and you don't have control over that at the moment. You could,
>>>> however, start a local cluster (./bin/start-local) after adjusting
>>>> your flink-conf.yaml and run your programs against that configured
>>>> cluster. You can do that either through your IDE using a
>>>> RemoteEnvironment or by submitting the packaged JAR to the local
>>>> cluster using the command-line tool (./bin/flink).
>>>>
>>>> Hope that helps.
>>>>
>>>> Cheers,
>>>> Max
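For reference, a minimal sketch of the two options described above,
assuming the Java DataSet API (the fraction value, port, and jar path are
only placeholders, not what the job actually uses):

    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.configuration.Configuration;

    public class MemorySetupSketch {
        public static void main(String[] args) {
            // Option 1: a local environment whose managed-memory fraction is
            // raised, leaving more memory segments for the hash joins.
            Configuration conf = new Configuration();
            conf.setFloat("taskmanager.memory.fraction", 0.9f);
            ExecutionEnvironment localEnv =
                    ExecutionEnvironment.createLocalEnvironment(conf);

            // Option 2: a remote environment pointing at a locally started
            // cluster (./bin/start-local) whose flink-conf.yaml sets
            // taskmanager.heap.mb and/or taskmanager.memory.fraction.
            ExecutionEnvironment remoteEnv =
                    ExecutionEnvironment.createRemoteEnvironment(
                            "localhost", 6123, "target/adamic-adar-job.jar");

            // ... build the Gelly program on either environment and execute()
        }
    }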
>>>> On Mon, Jul 20, 2015 at 2:04 PM, Shivani Ghatge <shghatge@gmail.com>
>>>> wrote:
>>>>
>>>>> Hello,
>>>>> I am working on a problem which implements the Adamic-Adar algorithm
>>>>> using Gelly.
>>>>> I am running into this exception for all the joins (including the
>>>>> ones that are part of the reduceOnNeighbors function):
>>>>>
>>>>> Too few memory segments provided. Hash Join needs at least 33 memory
>>>>> segments.
>>>>>
>>>>> The problem persists even when I comment out some of the joins.
>>>>>
>>>>> Even after using
>>>>>
>>>>>     edg = edg.join(graph.getEdges(), JoinOperatorBase.JoinHint.BROADCAST_HASH_SECOND)
>>>>>             .where(0,1).equalTo(0,1).with(new JoinEdge());
>>>>>
>>>>> as suggested by @AndraLungu, the problem persists.
>>>>>
>>>>> The code is:
>>>>>
>>>>>     DataSet<Tuple2<Long, Long>> degrees = graph.getDegrees();
>>>>>
>>>>>     // get the neighbors of each vertex in a HashSet for its value
>>>>>     computedNeighbors = graph.reduceOnNeighbors(new GatherNeighbors(), EdgeDirection.ALL);
>>>>>
>>>>>     // get vertices with updated values for the final Graph, which will
>>>>>     // be used to get the Adamic edges
>>>>>     Vertices = computedNeighbors.join(degrees, JoinOperatorBase.JoinHint.BROADCAST_HASH_FIRST)
>>>>>             .where(0).equalTo(0).with(new JoinNeighborDegrees());
>>>>>
>>>>>     Graph<Long, Tuple3<Double, HashSet<Long>, List<Tuple3<Long, Long, Double>>>, Double> updatedGraph =
>>>>>             Graph.fromDataSet(Vertices, edges, env);
>>>>>
>>>>>     // configure the vertex-centric iteration
>>>>>     VertexCentricConfiguration parameters = new VertexCentricConfiguration();
>>>>>     parameters.setName("Find Adamic Adar Edge Weights");
>>>>>     parameters.setDirection(EdgeDirection.ALL);
>>>>>
>>>>>     // run the vertex-centric iteration to get the Adamic-Adar edges
>>>>>     // into the vertex value
>>>>>     updatedGraph = updatedGraph.runVertexCentricIteration(
>>>>>             new GetAdamicAdarEdges<Long>(), new NeighborsMessenger<Long>(), 1, parameters);
>>>>>
>>>>>     // extract the vertices of the updated graph
>>>>>     DataSet<Vertex<Long, Tuple3<Double, HashSet<Long>, List<Tuple3<Long, Long, Double>>>>> vertices =
>>>>>             updatedGraph.getVertices();
>>>>>
>>>>>     // extract the list of edges from the vertex values
>>>>>     DataSet<Tuple3<Long, Long, Double>> edg = vertices.flatMap(new GetAdamicList());
>>>>>
>>>>>     // partial weights for the edges are added
>>>>>     edg = edg.groupBy(0,1).reduce(new AdamGroup());
>>>>>
>>>>>     // the graph is updated with the Adamic-Adar edges
>>>>>     edg = edg.join(graph.getEdges(), JoinOperatorBase.JoinHint.BROADCAST_HASH_SECOND)
>>>>>             .where(0,1).equalTo(0,1).with(new JoinEdge());
>>>>>
>>>>> Any idea how I could tackle this exception?
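The AdamGroup reducer itself is not shown in the thread. Purely as an
illustration of the "partial weights are added" step, a plausible minimal
version could be a ReduceFunction that keeps the edge key and sums the
Double field (this is an assumption, not the code from the PR):

    import org.apache.flink.api.common.functions.ReduceFunction;
    import org.apache.flink.api.java.tuple.Tuple3;

    // Called after groupBy(0,1), so both inputs describe the same (source, target)
    // edge; only the partial Adamic-Adar weights in field f2 are summed.
    public class AdamGroup implements ReduceFunction<Tuple3<Long, Long, Double>> {
        @Override
        public Tuple3<Long, Long, Double> reduce(
                Tuple3<Long, Long, Double> first,
                Tuple3<Long, Long, Double> second) {
            return new Tuple3<Long, Long, Double>(first.f0, first.f1, first.f2 + second.f2);
        }
    }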