flink-user mailing list archives

From Vasiliki Kalavri <vasilikikala...@gmail.com>
Subject Re: RuntimeException Gelly API: Memory ran out. Compaction failed.
Date Wed, 18 Mar 2015 16:42:56 GMT
Hi Robert,

my setup has even less memory than your setup, ~900MB in total.

When using the local environment (running the job through your IDE), the
available memory is split equally between the JobManager and the
TaskManager. The default amount of memory reserved for network buffers is
then subtracted from the TaskManager's share.
Finally, the TaskManager's managed memory is set to 70% (by default) of what is left.
In my case, this was 255MB.
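A back-of-the-envelope version of that split, as a sketch: the 64 MB network-buffer figure (2048 buffers of 32 KB) and the 70% fraction are assumed defaults here, and since the split is computed from the JVM's actual free memory at startup, the result won't land exactly on the 255 MB above:

```java
public class ManagedMemoryEstimate {

    // Mirrors the split described above: halve between JobManager and
    // TaskManager, subtract network buffers, keep 70% of the rest.
    public static long estimateBytes(long totalBytes, long networkBufferBytes,
                                     double managedFraction) {
        long taskManagerShare = totalBytes / 2;
        long afterBuffers = taskManagerShare - networkBufferBytes;
        return (long) (afterBuffers * managedFraction);
    }

    public static void main(String[] args) {
        long total = 900L * 1024 * 1024;   // ~900 MB total, as above
        long buffers = 64L * 1024 * 1024;  // assumed default: 2048 buffers x 32 KB
        long managed = estimateBytes(total, buffers, 0.7);
        System.out.println(managed / (1024 * 1024) + " MB"); // roughly in the 255 MB ballpark
    }
}
```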

So, I'm guessing that either the options you're passing to Eclipse are not
being read properly (I haven't tried it myself) or that there's something wrong
with the way you're generating the graph. That's why I asked how you produce
the vertex dataset.

Cheers,
V.



On 18 March 2015 at 18:27, Robert Waury <robert.waury@googlemail.com> wrote:

> Hi Vasia,
>
> How much memory does your job use?
>
> I think the problem is, as Stephan says, an overly conservative allocation,
> but that it will work if you throw enough memory at it.
>
> Or did your setup succeed with an amount of memory comparable to Mihail's
> and mine?
>
> My main point is that it shouldn't take 10x more memory than the input
> size for such a job.
>
> Cheers,
> Robert
> On Mar 18, 2015 5:06 PM, "Vasiliki Kalavri" <vasilikikalavri@gmail.com>
> wrote:
>
>> Hi Mihail, Robert,
>>
>> I've tried reproducing this, but I couldn't.
>> I'm using the same twitter input graph from SNAP that you link to and
>> also Scala IDE.
>> The job finishes without a problem (both the SSSP example from Gelly and
>> the unweighted version).
>>
>> The only thing I changed to run your version was creating the graph from
>> the edge set only, i.e. like this:
>>
>> Graph<Long, Long, NullValue> graph = Graph.fromDataSet(edges,
>>     new MapFunction<Long, Long>() {
>>         @Override
>>         public Long map(Long value) {
>>             return Long.MAX_VALUE;
>>         }
>>     }, env);
>>
>> Since the twitter input is an edge list, how do you generate the vertex
>> dataset in your case?
>>
>> Thanks,
>> -Vasia.
>>
>> On 18 March 2015 at 16:54, Mihail Vieru <vieru@informatik.hu-berlin.de>
>> wrote:
>>
>>>  Hi,
>>>
>>> great! Thanks!
>>>
>>> I really need this bug fixed because I'm laying the groundwork for my
>>> Diplom thesis and I need to be sure that the Gelly API is reliable and can
>>> handle large datasets as intended.
>>>
>>> Cheers,
>>> Mihail
>>>
>>>
>>> On 18.03.2015 15:40, Robert Waury wrote:
>>>
>>>   Hi,
>>>
>>>  I managed to reproduce the behavior and as far as I can tell it seems
>>> to be a problem with the memory allocation.
>>>
>>>  I have filed a bug report in JIRA to get the attention of somebody who
>>> knows the runtime better than I do.
>>>
>>> https://issues.apache.org/jira/browse/FLINK-1734
>>>
>>>  Cheers,
>>>  Robert
>>>
>>> On Tue, Mar 17, 2015 at 3:52 PM, Mihail Vieru <
>>> vieru@informatik.hu-berlin.de> wrote:
>>>
>>>>  Hi Robert,
>>>>
>>>> thank you for your reply.
>>>>
>>>> I'm starting the job from the Scala IDE. So only one JobManager and one
>>>> TaskManager in the same JVM.
>>>> I've doubled the memory in the eclipse.ini settings but I still get the
>>>> Exception.
>>>>
>>>> -vmargs
>>>> -Xmx2048m
>>>> -Xms100m
>>>> -XX:MaxPermSize=512m
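For what it's worth, if the -Xmx2048m above were actually reaching the job's JVM, the default split described earlier in the thread (half to the TaskManager, minus network buffers, times 0.7) would leave roughly 672 MB of managed memory, far more than the ~20 MB visible in the trace. A rough sanity check, with the 64 MB network-buffer figure as an assumption:

```java
public class HeapSplitCheck {

    // Half for the TaskManager, minus network buffers, times the managed fraction.
    public static double managedMemoryMb(double heapMb, double networkBuffersMb,
                                         double fraction) {
        return (heapMb / 2.0 - networkBuffersMb) * fraction;
    }

    public static void main(String[] args) {
        // 2048 MB heap as in the -Xmx setting above; 64 MB of buffers is assumed.
        System.out.println(managedMemoryMb(2048, 64, 0.7) + " MB");
    }
}
```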
>>>>
>>>> Best,
>>>> Mihail
>>>>
>>>>
>>>> On 17.03.2015 10:11, Robert Waury wrote:
>>>>
>>>>   Hi,
>>>>
>>>>  can you tell me how much memory your job has and how many workers you
>>>> are running?
>>>>
>>>>  From the trace it seems the internal hash table allocated only 7 MB
>>>> for the graph data and therefore runs out of memory pretty quickly.
>>>>
>>>>  Skewed data could also be an issue, but with a minimum of 5 pages and
>>>> a maximum of 8 per partition, the data seems to be distributed fairly
>>>> evenly across the different partitions.
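The numbers in the trace bear that out: assuming Flink's default 32 KB memory segment (page) size, the 7208960 bytes of partition memory correspond to 220 pages over 32 partitions, i.e. about 6.9 pages per partition on average, which fits the reported minimum of 5 and maximum of 8. A quick check (the page size is an assumption):

```java
public class PartitionSpreadCheck {

    public static double avgPagesPerPartition(long partitionMemory, int pageSize,
                                              int numPartitions) {
        long totalPages = partitionMemory / pageSize; // 220 pages in this case
        return (double) totalPages / numPartitions;
    }

    public static void main(String[] args) {
        long partitionMemory = 7208960L; // "Partition memory" from the exception
        int pageSize = 32 * 1024;        // assumed default memory segment size
        int numPartitions = 32;          // "numPartitions" from the exception
        System.out.println(avgPagesPerPartition(partitionMemory, pageSize, numPartitions)
                + " pages per partition on average");
    }
}
```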
>>>>
>>>>  Cheers,
>>>>  Robert
>>>>
>>>> On Tue, Mar 17, 2015 at 1:25 AM, Mihail Vieru <
>>>> vieru@informatik.hu-berlin.de> wrote:
>>>>
>>>>> And the correct SSSPUnweighted attached.
>>>>>
>>>>>
>>>>> On 17.03.2015 01:23, Mihail Vieru wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I'm getting the following RuntimeException for an adaptation of the
>>>>>> SingleSourceShortestPaths example using the Gelly API (see attachment).
>>>>>> It's been adapted for unweighted graphs having vertices with Long values.
>>>>>>
>>>>>> As an input graph I'm using the social network graph (~200MB
>>>>>> unpacked) from here:
>>>>>> https://snap.stanford.edu/data/higgs-twitter.html
>>>>>>
>>>>>> For the small SSSPDataUnweighted graph (also attached) it terminates
>>>>>> and computes the distances correctly.
>>>>>>
>>>>>>
>>>>>> 03/16/2015 17:18:23    IterationHead(WorksetIteration (Vertex-centric
>>>>>> iteration
>>>>>> (org.apache.flink.graph.library.SingleSourceShortestPathsUnweighted$VertexDistanceUpdater@dca6fe4
>>>>>> |
>>>>>> org.apache.flink.graph.library.SingleSourceShortestPathsUnweighted$MinDistanceMessenger@6577e8ce)))(2/4)
>>>>>> switched to FAILED
>>>>>> java.lang.RuntimeException: Memory ran out. Compaction failed.
>>>>>> numPartitions: 32 minPartition: 5 maxPartition: 8 number of overflow
>>>>>> segments: 176 bucketSize: 217 Overall memory: 20316160 Partition
>>>>>> memory: 7208960 Message: Index: 8, Size: 7
>>>>>>     at
>>>>>> org.apache.flink.runtime.operators.hash.CompactingHashTable.insert(CompactingHashTable.java:390)
>>>>>>     at
>>>>>> org.apache.flink.runtime.operators.hash.CompactingHashTable.buildTable(CompactingHashTable.java:337)
>>>>>>     at
>>>>>> org.apache.flink.runtime.iterative.task.IterationHeadPactTask.readInitialSolutionSet(IterationHeadPactTask.java:216)
>>>>>>     at
>>>>>> org.apache.flink.runtime.iterative.task.IterationHeadPactTask.run(IterationHeadPactTask.java:278)
>>>>>>     at
>>>>>> org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:362)
>>>>>>     at
>>>>>> org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:205)
>>>>>>     at java.lang.Thread.run(Thread.java:745)
>>>>>>
>>>>>>
>>>>>> Best,
>>>>>> Mihail
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
