flink-user mailing list archives

From Mihail Vieru <vieru@informatik.hu-berlin.de>
Subject Re: RuntimeException Gelly API: Memory ran out. Compaction failed.
Date Wed, 18 Mar 2015 17:21:02 GMT
Hi Vasia,

I have used a simple job (attached) to generate a file which looks like 
this:

0 0
1 1
2 2
...
456629 456629
456630 456630

I need the vertices to be generated from a file for my future work.
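
A minimal sketch of what such a generator job can look like (the actual
attached job may differ; the output path and job name here are made up):

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.ExecutionEnvironment;

public class VertexFileGenerator {

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // one "id id" line per vertex: the id doubles as the initial vertex value
        env.generateSequence(0L, 456630L)
                .map(new MapFunction<Long, String>() {
                    public String map(Long id) {
                        return id + " " + id;
                    }
                })
                .writeAsText("/tmp/vertices.txt"); // illustrative output path

        env.execute("Generate vertex file");
    }
}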

Cheers,
Mihail


On 18.03.2015 17:04, Vasiliki Kalavri wrote:
> Hi Mihail, Robert,
>
> I've tried reproducing this, but I couldn't.
> I'm using the same Twitter input graph from SNAP that you linked to
> and also the Scala IDE.
> The job finishes without a problem (both the SSSP example from Gelly 
> and the unweighted version).
>
> The only thing I changed to run your version was creating the graph 
> from the edge set only, i.e. like this:
>
> Graph<Long, Long, NullValue> graph = Graph.fromDataSet(edges,
>         new MapFunction<Long, Long>() {
>             public Long map(Long value) {
>                 return Long.MAX_VALUE;
>             }
>         }, env);
>
> Since the Twitter input is an edge list, how do you generate the
> vertex dataset in your case?
>
> Thanks,
> -Vasia.
>
> On 18 March 2015 at 16:54, Mihail Vieru <vieru@informatik.hu-berlin.de> wrote:
>
>     Hi,
>
>     great! Thanks!
>
>     I really need this bug fixed because I'm laying the groundwork for
>     my Diplom thesis and I need to be sure that the Gelly API is
>     reliable and can handle large datasets as intended.
>
>     Cheers,
>     Mihail
>
>
>     On 18.03.2015 15:40, Robert Waury wrote:
>>     Hi,
>>
>>     I managed to reproduce the behavior and as far as I can tell it
>>     seems to be a problem with the memory allocation.
>>
>>     I have filed a bug report in JIRA to get the attention of
>>     somebody who knows the runtime better than I do.
>>
>>     https://issues.apache.org/jira/browse/FLINK-1734
>>
>>     Cheers,
>>     Robert
>>
>>     On Tue, Mar 17, 2015 at 3:52 PM, Mihail Vieru <vieru@informatik.hu-berlin.de> wrote:
>>
>>         Hi Robert,
>>
>>         thank you for your reply.
>>
>>         I'm starting the job from the Scala IDE, so there is only one
>>         JobManager and one TaskManager in the same JVM.
>>         I've doubled the memory in the eclipse.ini settings but I
>>         still get the Exception.
>>
>>         -vmargs
>>         -Xmx2048m
>>         -Xms100m
>>         -XX:MaxPermSize=512m
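>>
>>         For what it's worth, a sketch of setting Flink's managed memory
>>         directly on the local environment instead (assuming this Flink
>>         version has the Configuration overload of createLocalEnvironment
>>         and the taskmanager.memory.size key):
>>
>>         import org.apache.flink.api.java.ExecutionEnvironment;
>>         import org.apache.flink.configuration.ConfigConstants;
>>         import org.apache.flink.configuration.Configuration;
>>
>>         Configuration conf = new Configuration();
>>         // managed memory for the embedded TaskManager, in megabytes
>>         conf.setInteger(ConfigConstants.TASK_MANAGER_MEMORY_SIZE_KEY, 512);
>>         ExecutionEnvironment env =
>>                 ExecutionEnvironment.createLocalEnvironment(conf);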
>>
>>         Best,
>>         Mihail
>>
>>
>>         On 17.03.2015 10:11, Robert Waury wrote:
>>>         Hi,
>>>
>>>         can you tell me how much memory your job has and how many
>>>         workers you are running?
>>>
>>>         From the trace it seems the internal hash table allocated
>>>         only about 7 MB (Partition memory: 7208960 bytes) for the
>>>         graph data and therefore runs out of memory pretty quickly.
>>>
>>>         Skewed data could also be an issue, but with a minimum of 5
>>>         pages and a maximum of 8 it seems to be distributed fairly
>>>         evenly across the different partitions.
>>>
>>>         Cheers,
>>>         Robert
>>>
>>>         On Tue, Mar 17, 2015 at 1:25 AM, Mihail Vieru <vieru@informatik.hu-berlin.de> wrote:
>>>
>>>             And the correct SSSPUnweighted attached.
>>>
>>>
>>>             On 17.03.2015 01:23, Mihail Vieru wrote:
>>>
>>>                 Hi,
>>>
>>>                 I'm getting the following RuntimeException for an
>>>                 adaptation of the SingleSourceShortestPaths example
>>>                 using the Gelly API (see attachment). It has been
>>>                 adapted for unweighted graphs whose vertices carry
>>>                 Long values.
>>>
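>>>                 A rough sketch of the two functions, matching the
>>>                 class names in the trace below (assuming the Gelly
>>>                 vertex-centric API of this Flink version; the actual
>>>                 attached class may differ):
>>>
>>>                 import org.apache.flink.graph.spargel.MessageIterator;
>>>                 import org.apache.flink.graph.spargel.MessagingFunction;
>>>                 import org.apache.flink.graph.spargel.VertexUpdateFunction;
>>>                 import org.apache.flink.types.NullValue;
>>>
>>>                 // distances are hop counts, so vertex values and
>>>                 // messages are plain Longs
>>>                 public static final class VertexDistanceUpdater
>>>                         extends VertexUpdateFunction<Long, Long, Long> {
>>>                     @Override
>>>                     public void updateVertex(Long vertexKey, Long vertexValue,
>>>                             MessageIterator<Long> inMessages) {
>>>                         long minDistance = Long.MAX_VALUE;
>>>                         for (long msg : inMessages) {
>>>                             if (msg < minDistance) {
>>>                                 minDistance = msg;
>>>                             }
>>>                         }
>>>                         if (vertexValue > minDistance) {
>>>                             setNewVertexValue(minDistance);
>>>                         }
>>>                     }
>>>                 }
>>>
>>>                 // unweighted: every edge contributes a hop count of 1
>>>                 public static final class MinDistanceMessenger
>>>                         extends MessagingFunction<Long, Long, Long, NullValue> {
>>>                     @Override
>>>                     public void sendMessages(Long vertexKey, Long newDistance) {
>>>                         sendMessageToAllNeighbors(newDistance + 1);
>>>                     }
>>>                 }
>>>
>>>                 wired together via
>>>                 graph.runVertexCentricIteration(new VertexDistanceUpdater(),
>>>                 new MinDistanceMessenger(), maxIterations).
>>>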
>>>                 As an input graph I'm using the social network graph
>>>                 (~200MB unpacked) from here:
>>>                 https://snap.stanford.edu/data/higgs-twitter.html
>>>
>>>                 For the small SSSPDataUnweighted graph (also
>>>                 attached) it terminates and computes the distances
>>>                 correctly.
>>>
>>>
>>>                 03/16/2015 17:18:23 IterationHead(WorksetIteration
>>>                 (Vertex-centric iteration
>>>                 (org.apache.flink.graph.library.SingleSourceShortestPathsUnweighted$VertexDistanceUpdater@dca6fe4
>>>                 |
>>>                 org.apache.flink.graph.library.SingleSourceShortestPathsUnweighted$MinDistanceMessenger@6577e8ce)))(2/4)
>>>                 switched to FAILED
>>>                 java.lang.RuntimeException: Memory ran out.
>>>                 Compaction failed. numPartitions: 32 minPartition: 5
>>>                 maxPartition: 8 number of overflow segments: 176
>>>                 bucketSize: 217 Overall memory: 20316160 Partition
>>>                 memory: 7208960 Message: Index: 8, Size: 7
>>>                     at
>>>                 org.apache.flink.runtime.operators.hash.CompactingHashTable.insert(CompactingHashTable.java:390)
>>>                     at
>>>                 org.apache.flink.runtime.operators.hash.CompactingHashTable.buildTable(CompactingHashTable.java:337)
>>>                     at
>>>                 org.apache.flink.runtime.iterative.task.IterationHeadPactTask.readInitialSolutionSet(IterationHeadPactTask.java:216)
>>>                     at
>>>                 org.apache.flink.runtime.iterative.task.IterationHeadPactTask.run(IterationHeadPactTask.java:278)
>>>                     at
>>>                 org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:362)
>>>                     at
>>>                 org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:205)
>>>                     at java.lang.Thread.run(Thread.java:745)
>>>
>>>
>>>                 Best,
>>>                 Mihail
>>>
>>>
>>>
>>
>>
>
>

