giraph-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alessandro Presta" <alessan...@fb.com>
Subject Re: Review Request: Move part of the graph out-of-core when memory is low
Date Tue, 21 Aug 2012 22:42:29 GMT


> On Aug. 21, 2012, 4:31 p.m., Maja Kabiljo wrote:
> > Nice work, Alessandro.
> > 
> > Have you checked from which places in the code getPartition is called? For example,
from the quick look, one of the things which worries me is in NettyWorkerServer.prepareSuperstep,
line 118. service.getVertex() ends up calling it, and the order in which vertices are visited
here is not sorted by partition. So you could possibly end up reading and writing the same
partition to disk many times. Did you maybe do some testing to see how many times are the
partitions actually swapped to/from disk during a superstep? 
> > (the one which I mentioned could be fixed by adding hasVertexId() to PartitionStore,
since we just check if the vertex exists at that place)

I've analyzed the calls to getPartition(s), and besides the compute loop, we have the one
you mentioned and the loop that initializes partition stats at the end of input superstep.

For the former, the only way is to add a set of vertex ids to PartitionStore, but this makes
it less scalable, as then we have to assume that we can store all the vertex ids; on the other
hand, MessageStore makes this very same assumption. Another problem with keeping track of
vertex ids is that we have some code that calls Partition#putVertex() directly. Also, addPartition()
would then have to read all vertex ids even when the partition is in memory, which would make
it way slower in the standard use case.

For the latter, we already keep track of the number of vertices for out-of-core partitions,
but doing the same for edges presents simular issues, plus it doesn't play well with mutations
(which is why Partition#getEdgeCount() currently computes the sum on the fly).

In other words, yes, there are other potential savings, but not easy as they may seem.
As I've said before, I think we need to discuss substantial design changes before we can take
these approaches to their full extent.

Anyway, for the record, here's the log of a full superstep for a worker in a PageRank job
with 10 workers, 5 partitions in memory out of 10. The graph is a clique so there are always
messages for each partition:
http://pastebin.com/raw.php?i=igYsAJGT
I can see 6 read/write operations by DiskBackedPartitionStore, so it doesn't look that bad.


> On Aug. 21, 2012, 4:31 p.m., Maja Kabiljo wrote:
> > http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/BspServiceWorker.java,
lines 105-107
> > <https://reviews.apache.org/r/5987/diff/5/?file=143400#file143400line105>
> >
> >     This is not used anymore.

Oh, thanks!


> On Aug. 21, 2012, 4:31 p.m., Maja Kabiljo wrote:
> > http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/partition/PartitionStore.java,
line 54
> > <https://reviews.apache.org/r/5987/diff/5/?file=143406#file143406line54>
> >
> >     Typo.

Whoops, fixed!


> On Aug. 21, 2012, 4:31 p.m., Maja Kabiljo wrote:
> > http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/partition/DiskBackedPartitionStore.java,
line 116
> > <https://reviews.apache.org/r/5987/diff/5/?file=143403#file143403line116>
> >
> >     I think you want to do this before you put it in the map. Like this someone
can take it and lock it before the thread creating it does.
> >     Could you also say which assumptions are you making for concurrency in this
code so it would be easier to review (i.e. which operations can happen in parallel and which
can't)?

Well spotted, many thanks!

The assumption is that addPartition() and addPartitionVertices() are called in a multi-threaded
context (processing incoming vertex requests) whereas getPartition(), removePartition() and
similar are called in a sequential context (e.g. iterating through partitions to call compute).


- Alessandro


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/5987/#review10579
-----------------------------------------------------------


On Aug. 21, 2012, 11:16 a.m., Alessandro Presta wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/5987/
> -----------------------------------------------------------
> 
> (Updated Aug. 21, 2012, 11:16 a.m.)
> 
> 
> Review request for giraph and Avery Ching.
> 
> 
> Description
> -------
> 
> I gave this another shot. This time it plays nicely with input superstep: I replaced
both the temporary partitions and the worker partition map with a common data structure, PartitionStore,
held in ServerData. This is similar to what we now have for messages.
> 
> Partition is now thread-safe so that we can have concurrent calls to putVertex().
> 
> SimplePartitionStore is backed by a concurrent hash map (nothing new here, except that
we skip copying partitions to the worker).
> 
> DiskBackedPartitionStore can hold up to a user-defined number of partitions in memory,
and spill the remaining ones to disk. Each partition is stored in a separate file.
> Adding vertices to an out-of-core partition consists in appending them to the file, which
makes processing vertex requests relatively fast.
> We use a ReadWriteLock for each partition: performing operations on a partition held
in memory only requires a read-lock (since Partition is thread-safe), while creating a new
partition, moving it in and out of core or appending vertices requires a write-lock (we can't
have concurrent writes).
> 
> Also note that this breaks Hadoop RPC: I preferred to keep it clean (this also shows
what code we get rid of) instead of working around it. I suppose the Netty security patch
will be completed soon. If not, I will restore RPC compatibility.
> 
> More here: https://issues.apache.org/jira/browse/GIRAPH-249?focusedCommentId=13435280&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13435280
> 
> 
> This addresses bug GIRAPH-249.
>     https://issues.apache.org/jira/browse/GIRAPH-249
> 
> 
> Diffs
> -----
> 
>   http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/bsp/CentralizedServiceWorker.java
1375453 
>   http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/BasicRPCCommunications.java
1375453 
>   http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/NettyWorkerClientServer.java
1375453 
>   http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/NettyWorkerServer.java
1375453 
>   http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/SendVertexRequest.java
1375453 
>   http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/ServerData.java
1375453 
>   http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/WorkerServer.java
1375453 
>   http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/BspServiceWorker.java
1375453 
>   http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/GiraphJob.java
1375453 
>   http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/GraphMapper.java
1375453 
>   http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/partition/DiskBackedPartitionStore.java
PRE-CREATION 
>   http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/partition/HashWorkerPartitioner.java
1375453 
>   http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/partition/Partition.java
1375453 
>   http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/partition/PartitionStore.java
PRE-CREATION 
>   http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/partition/SimplePartitionStore.java
PRE-CREATION 
>   http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/partition/WorkerGraphPartitioner.java
1375453 
>   http://svn.apache.org/repos/asf/giraph/trunk/src/test/java/org/apache/giraph/TestMutateGraphVertex.java
1375453 
>   http://svn.apache.org/repos/asf/giraph/trunk/src/test/java/org/apache/giraph/comm/ConnectionTest.java
1375453 
>   http://svn.apache.org/repos/asf/giraph/trunk/src/test/java/org/apache/giraph/comm/RequestTest.java
1375453 
>   http://svn.apache.org/repos/asf/giraph/trunk/src/test/java/org/apache/giraph/graph/partition/TestPartitionStores.java
PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/5987/diff/
> 
> 
> Testing
> -------
> 
> mvn verify and pseudo-distributed mode tests with both SimplePartitionStore and DiskBackedPartitionStore
> 
> 
> Thanks,
> 
> Alessandro Presta
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message