giraph-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alessandro Presta <alessan...@fb.com>
Subject Re: [jira] [Commented] (GIRAPH-249) Move part of the graph out-of-core when memory is low
Date Fri, 17 Aug 2012 13:54:38 GMT
I think at that point it would be worth having a new logical place for
vertex/edge representation at worker- or partition-level.
Avery had some ideas about this.

Basically right now we're giving the user the freedom (and responsibility)
to choose a representation (both in-memory and for serialization), but
another way to go would be to take care of all that at infrastructure
level and expose only one Vertex class (where the user only defines the
computation details and everything else is abstracted away). Then we could
play around with compact representations and even more disruptive
strategies (like streaming the graph/messages and re-using objects).

On 8/17/12 2:30 PM, "Gianmarco De Francisci Morales" <gdfm@apache.org>
wrote:

>I was under the impression that 100k was the upper limit to make things
>work without crashing.
>
>In any case, if one wanted to use a compressed memory representation by
>aggregating different edge lists together, could one use the worker
>context
>as a central point of access to the compressed graphs?
>I can imagine a vertex class that has only the ID and uses the worker
>context to access its edge list (i.e. it is only a client to a central
>per-machine repository).
>Vertexes in the same partition would share this data structure.
>
>Is there any obvious technical fallacy in this scheme?
>
>Cheers,
>--
>Gianmarco
>
>
>
>On Fri, Aug 17, 2012 at 3:18 PM, Alessandro Presta
><alessandro@fb.com>wrote:
>
>> The example where we actually go out of memory was with 500K vertices
>>and
>> 500M edges, but yes, as a general rule we should strive to reduce our
>> memory footprint in order to push the point where we need to go out of
>> core as far away as possible.
>>
>> On 8/17/12 2:11 PM, "Gianmarco De Francisci Morales" <gdfm@apache.org>
>> wrote:
>>
>> >Very interesting.
>> >
>> >On a side note, a graph with 100k vertices and 100M edges is largish
>>but
>> >not that big after all.
>> >If it does not fit on 10+ GB of memory, it means that each edge
>>occupies
>> >around 100B (amortizing the cost of the vertex over the edges).
>> >In my opinion this deserves some thought.
>> >If memory is an issue, why not think about compressed memory
>>structures,
>> >at
>> >least for common graph formats?
>> >
>> >Cheers,
>> >--
>> >Gianmarco
>> >
>> >
>> >
>> >On Wed, Aug 15, 2012 at 11:20 PM, Eli Reisman
>> ><initialcontext@gmail.com>wrote:
>> >
>> >> Great metrics, this made a very interesting read, and great code too
>>as
>> >> always. This must have been a lot of work. I like the idea of
>> >>eliminating
>> >> the extra temporary storage data structures where possible, even when
>> >>not
>> >> going out-of-core. I think that + avoiding extra object creation
>>during
>> >>the
>> >> workflow can still do a lot for in-core job's memory profile, but
>>this
>> >>is
>> >> looking really good and sounds like with the config options its also
>> >> pluggable depending on your hardware situation, so it sounds great to
>> >>me.
>> >> Great work!
>> >>
>> >> On Wed, Aug 15, 2012 at 12:23 PM, Alessandro Presta (JIRA)
>> >> <jira@apache.org>wrote:
>> >>
>> >> >
>> >> >     [
>> >> >
>> >>
>> 
>>>>https://issues.apache.org/jira/browse/GIRAPH-249?page=com.atlassian.jir
>>>>a
>> .
>> 
>>>>plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13435437
>>>>#c
>> >>omment-13435437
>> >> ]
>> >> >
>> >> > Alessandro Presta commented on GIRAPH-249:
>> >> > ------------------------------------------
>> >> >
>> >> > Thanks Claudio, good observation.
>> >> > You got me curious so I quickly ran a shortest paths benchmark.
>> >> >
>> >> > 500k vertices, 100 edges/vertex, 10 workers
>> >> >
>> >> > This is with trunk:
>> >> >
>> >> > {code}
>> >> > hadoop jar giraph-trunk.jar
>> >> > org.apache.giraph.benchmark.ShortestPathsBenchmark
>> >>-Dgiraph.useNetty=true
>> >> > -v -V 500000 -e 100 -w 10
>> >> > {code}
>> >> >
>> >> > {code}
>> >> > Superstep 3 (milliseconds)      5,394
>> >> > Superstep 4 (milliseconds)      5,650
>> >> > Superstep 23 (milliseconds)     1,100
>> >> > Superstep 20 (milliseconds)     1,108
>> >> > Superstep 31 (milliseconds)     1,192
>> >> > Setup (milliseconds)    3,297
>> >> > Shutdown (milliseconds) 2,309
>> >> > Superstep 27 (milliseconds)     1,135
>> >> > Superstep 7 (milliseconds)      4,476
>> >> > Superstep 9 (milliseconds)      3,400
>> >> > Superstep 8 (milliseconds)      4,043
>> >> > Superstep 16 (milliseconds)     1,184
>> >> > Superstep 14 (milliseconds)     2,251
>> >> > Superstep 6 (milliseconds)      5,150
>> >> > Superstep 24 (milliseconds)     1,167
>> >> > Superstep 18 (milliseconds)     1,176
>> >> > Superstep 5 (milliseconds)      5,483
>> >> > Superstep 1 (milliseconds)      1,125
>> >> > Superstep 21 (milliseconds)     1,192
>> >> > Total (milliseconds)    85,757
>> >> > Superstep 15 (milliseconds)     1,375
>> >> > Superstep 22 (milliseconds)     1,159
>> >> > Vertex input superstep (milliseconds)   11,644
>> >> > Superstep 25 (milliseconds)     1,058
>> >> > Superstep 17 (milliseconds)     1,075
>> >> > Superstep 26 (milliseconds)     1,051
>> >> > Superstep 12 (milliseconds)     2,342
>> >> > Superstep 10 (milliseconds)     3,192
>> >> > Superstep 19 (milliseconds)     1,092
>> >> > Superstep 11 (milliseconds)     2,533
>> >> > Superstep 30 (milliseconds)     1,126
>> >> > Superstep 0 (milliseconds)      821
>> >> > Superstep 28 (milliseconds)     1,184
>> >> > Superstep 29 (milliseconds)     1,116
>> >> > Superstep 2 (milliseconds)      1,165
>> >> > Superstep 13 (milliseconds)     1,983
>> >> > {code}
>> >> >
>> >> > And with 5 partitions out-of-core:
>> >> >
>> >> > {code}
>> >> > hadoop jar giraph-249.jar
>> >> > org.apache.giraph.benchmark.ShortestPathsBenchmark
>> >>-Dgiraph.useNetty=true
>> >> > -Dgiraph.useOutOfCoreGraph=true -Dgiraph.maxPartitionsInMemory=5
>>-v -V
>> >> > 500000 -e 100 -w 10
>> >> > {code}
>> >> >
>> >> > {code}
>> >> > Superstep 3 (milliseconds)      27,407
>> >> > Superstep 4 (milliseconds)      26,620
>> >> > Superstep 23 (milliseconds)     20,906
>> >> > Superstep 20 (milliseconds)     21,324
>> >> > Superstep 31 (milliseconds)     21,055
>> >> > Setup (milliseconds)    2,639
>> >> > Superstep 7 (milliseconds)      25,819
>> >> > Superstep 27 (milliseconds)     20,790
>> >> > Shutdown (milliseconds) 175
>> >> > Superstep 16 (milliseconds)     21,434
>> >> > Superstep 8 (milliseconds)      24,434
>> >> > Superstep 9 (milliseconds)      24,183
>> >> > Superstep 14 (milliseconds)     22,401
>> >> > Superstep 6 (milliseconds)      25,948
>> >> > Superstep 24 (milliseconds)     20,968
>> >> > Superstep 18 (milliseconds)     21,179
>> >> > Superstep 5 (milliseconds)      27,134
>> >> > Superstep 1 (milliseconds)      20,315
>> >> > Superstep 21 (milliseconds)     21,442
>> >> > Total (milliseconds)    729,459
>> >> > Superstep 15 (milliseconds)     22,198
>> >> > Superstep 22 (milliseconds)     20,875
>> >> > Vertex input superstep (milliseconds)   19,595
>> >> > Superstep 25 (milliseconds)     20,829
>> >> > Superstep 17 (milliseconds)     21,617
>> >> > Superstep 12 (milliseconds)     22,548
>> >> > Superstep 26 (milliseconds)     20,763
>> >> > Superstep 19 (milliseconds)     21,302
>> >> > Superstep 10 (milliseconds)     23,823
>> >> > Superstep 11 (milliseconds)     22,908
>> >> > Superstep 0 (milliseconds)      10,836
>> >> > Superstep 30 (milliseconds)     21,014
>> >> > Superstep 28 (milliseconds)     21,109
>> >> > Superstep 29 (milliseconds)     21,158
>> >> > Superstep 13 (milliseconds)     21,974
>> >> > Superstep 2 (milliseconds)      20,726
>> >> > {code}
>> >> >
>> >> > So yeah, this is more like an order of magnitude. Of course this is
>> >> > nothing scientific.
>> >> >
>> >> > > Move part of the graph out-of-core when memory is low
>> >> > > -----------------------------------------------------
>> >> > >
>> >> > >                 Key: GIRAPH-249
>> >> > >                 URL:
>> >>https://issues.apache.org/jira/browse/GIRAPH-249
>> >> > >             Project: Giraph
>> >> > >          Issue Type: Improvement
>> >> > >            Reporter: Alessandro Presta
>> >> > >            Assignee: Alessandro Presta
>> >> > >         Attachments: GIRAPH-249.patch, GIRAPH-249.patch,
>> >> > GIRAPH-249.patch, GIRAPH-249.patch, GIRAPH-249.patch,
>>GIRAPH-249.patch
>> >> > >
>> >> > >
>> >> > > There has been some talk about Giraph's scaling limitations due
>>to
>> >> > keeping the whole graph and messages in RAM.
>> >> > > We need to investigate methods to fall back to disk when running
>> >>out of
>> >> > memory, while gracefully degrading performance.
>> >> > > This issue is for graph storage. Messages should probably be a
>> >>separate
>> >> > issue, although the interplay between the two is crucial.
>> >> > > We should also discuss what are our primary goals here:
>>completing a
>> >> job
>> >> > (albeit slowly) instead of failing when the graph is too big, while
>> >>still
>> >> > encouraging memory optimizations and high-memory clusters; or
>> >> restructuring
>> >> > Giraph to be as efficient as possible in disk mode, making it
>>almost a
>> >> > standard way of operating.
>> >> >
>> >> > --
>> >> > This message is automatically generated by JIRA.
>> >> > If you think it was sent incorrectly, please contact your JIRA
>> >> > administrators:
>> >> >
>> 
>>>>https://issues.apache.org/jira/secure/ContactAdministrators!default.jsp
>>>>a
>> >> > For more information on JIRA, see:
>> >> http://www.atlassian.com/software/jira
>> >> >
>> >> >
>> >> >
>> >>
>>
>>


Mime
View raw message