giraph-user mailing list archives

From Eli Reisman <apache.mail...@gmail.com>
Subject Re: Getting SimpleTriangleClosingVertex to run
Date Tue, 09 Oct 2012 01:00:39 GMT
Brief follow-up:

GIRAPH-314, which is not rebased or committed yet, is another piece of this
puzzle. In it I attempt to combine messages and add a primitive (hacky)
ability to amortize the supersteps in which vertices message each other, to
keep the per-superstep message volume down. It's a blatant trade of time
for space, and probably a desperate cry for help too. I will update it ASAP
so you can play with it. I had pretty promising results, but that was when I
had a cluster to play with ;)
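
One way to picture the amortization idea above is the sketch below. It is not
the GIRAPH-314 patch and uses no Giraph classes: AmortizedBroadcast,
targetsForRound, and the rounds parameter are hypothetical names made up for
illustration. The assumption is that a vertex with many neighbors messages
only a slice of them in each superstep, so the same traffic is spread over
several supersteps instead of peaking in one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration; not part of Giraph or GIRAPH-314. */
public class AmortizedBroadcast {

  /**
   * Returns the neighbors to message in the given round when one broadcast
   * is spread over `rounds` supersteps. Neighbor i is assigned to round
   * (i % rounds), so every neighbor is messaged exactly once overall.
   */
  public static List<Integer> targetsForRound(
      List<Integer> neighbors, int round, int rounds) {
    List<Integer> batch = new ArrayList<>();
    for (int i = round; i < neighbors.size(); i += rounds) {
      batch.add(neighbors.get(i));
    }
    return batch;
  }

  public static void main(String[] args) {
    List<Integer> neighbors = Arrays.asList(10, 11, 12, 13, 14);
    int rounds = 2; // more rounds: fewer messages per superstep, longer run
    for (int round = 0; round < rounds; round++) {
      System.out.println("superstep " + round + " sends to "
          + targetsForRound(neighbors, round, rounds));
    }
  }
}
```

With rounds set to 1 this degenerates to the usual send-to-everyone broadcast.
Raising it lowers the peak number of in-flight messages at the cost of extra
supersteps, which is the time-for-space trade described above.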

As a first step, I'd try Maja's recipe for spill-to-disk during messaging. Her
advice is in the GIRAPH-314, GIRAPH-322, and GIRAPH-328 threads.

On Mon, Oct 8, 2012 at 3:55 PM, Eli Reisman <apache.mailbox@gmail.com> wrote:

> I have had some trouble scaling it too. That is an issue I've been working
> at from several angles for a few months now. The main problem is the
> explosion of messaging that occurs.
>
> It might be worth trying the spill-to-disk features. There was a thread
> in the JIRA (I think for GIRAPH-328 or 322, maybe a bit earlier, I can
> check...) where Maja explained that the spill also halts computation
> when messages build up, so that we never quite overrun our memory reserves
> during the computation/message stages. This trades time for space, but it is
> something I have been meaning to experiment with, as in many situations
> it's a trade well worth making. I will be experimenting with this option
> myself soon; it's on my "short list" of Giraph stuff-to-do!
>
> I am also independently working on some ways to deduplicate broadcast
> messages, such as those used in triangle closing, so that in-memory runs of
> this algorithm are possible at interesting scales. That idea has undergone
> some "evolution" and is still underway (it's the aforementioned GIRAPH-322),
> so more to follow there when my schoolwork lets up... ;)
>
> Eli
>
>
>
> On Sun, Oct 7, 2012 at 12:11 PM, Vernon Thommeret <synotic@gmail.com> wrote:
>
>> Thanks. I ended up getting it working. Having some issues scaling it,
>> but working on it.
>>
>> On Mon, Sep 24, 2012 at 1:17 PM, Eli Reisman <apache.mailbox@gmail.com>
>> wrote:
>> > The io format types have to be compatible. Since
>> > IdWithValueVertexOutputFormat does not specify the types it takes, it just
>> > attempts to output them using the Writable interface. I use it to output
>> > data from the SimpleTriangleClosingVertex. I also had to write an
>> > InputFormat to accept IntWritable ids and IntWritable out-edge
>> > destinations. Otherwise, it should work.
>> >
>> >
>> >
>> > On Mon, Sep 24, 2012 at 12:06 AM, Avery Ching <aching@apache.org> wrote:
>> >>
>> >> I don't think the types are compatible.
>> >>
>> >> public class SimpleTriangleClosingVertex extends EdgeListVertex<
>> >>   IntWritable, SimpleTriangleClosingVertex.IntArrayListWritable,
>> >>   NullWritable, IntWritable>
>> >>
>> >> You'll need to use an input format and output format that fit these
>> >> types.  Otherwise the issue is likely to be serialization/deserialization
>> >> here.
>> >>
>> >>
>> >> On 9/23/12 10:44 PM, Vernon Thommeret wrote:
>> >>>
>> >>> I'm trying to get the SimpleTriangleClosingVertex to run, but getting
>> >>> this error:
>> >>>
>> >>> java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException: IPC server unable to read call parameters: null
>> >>>         at org.apache.giraph.comm.BasicRPCCommunications.sendPartitionRequest(BasicRPCCommunications.java:923)
>> >>>         at org.apache.giraph.graph.BspServiceWorker.loadVertices(BspServiceWorker.java:327)
>> >>>         at org.apache.giraph.graph.BspServiceWorker.setup(BspServiceWorker.java:604)
>> >>>         at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:377)
>> >>>         at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:578)
>> >>>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
>> >>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
>> >>>         at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>> >>>         at java.security.AccessController.doPrivileged(Native Method)
>> >>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>> >>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
>> >>>         at org.apache.hadoop.mapred.Child.main(Child.java:264)
>> >>> Caused by: org.apache.hadoop.ipc.RemoteException: IPC server
>> >>>
>> >>> This is the diff that causes the issue:
>> >>>
>> >>> @@ -33,7 +33,7 @@ import org.apache.hadoop.fs.Path;
>> >>>   import org.apache.hadoop.io.IntWritable;
>> >>>
>> >>>   import org.apache.giraph.graph.GiraphJob;
>> >>> -import org.apache.giraph.graph.IntIntNullIntVertex;
>> >>> +import org.apache.giraph.examples.SimpleTriangleClosingVertex;
>> >>>   import org.apache.giraph.io.IntIntNullIntTextInputFormat;
>> >>>   import org.apache.giraph.io.AdjacencyListTextVertexOutputFormat;
>> >>>
>> >>> @@ -44,16 +44,12 @@ import org.apache.log4j.Logger;
>> >>>   /**
>> >>>    * Simple function to return the in degree for each vertex.
>> >>>    */
>> >>> -public class SharedConnectionsVertex extends IntIntNullIntVertex implements Tool {
>> >>> +public class SharedConnections implements Tool {
>> >>>
>> >>>     private Configuration conf;
>> >>>     private static final Logger LOG =
>> >>>         Logger.getLogger(SharedConnections.class);
>> >>>
>> >>> -  public void compute(Iterable<IntWritable> messages) {
>> >>> -    voteToHalt();
>> >>> -  }
>> >>> -
>> >>>     @Override
>> >>>     public final int run(final String[] args) throws Exception {
>> >>>       Options options = new Options();
>> >>> @@ -71,7 +67,7 @@ public class SharedConnections extends IntIntNullIntVertex implements Tool {
>> >>>
>> >>>       GiraphJob job = new GiraphJob(getConf(), getClass().getName());
>> >>>
>> >>> -    job.setVertexClass(SharedConnections.class);
>> >>> +    job.setVertexClass(SimpleTriangleClosingVertex.class);
>> >>>       job.setVertexInputFormatClass(IntIntNullIntTextInputFormat.class);
>> >>>       job.setVertexOutputFormatClass(AdjacencyListTextVertexOutputFormat.class);
>> >>>       job.setWorkerConfiguration(10, 10, 100.0f);
>> >>>
>> >>> --
>> >>>
>> >>> I.e. I have a dummy job that just outputs the vertices which works,
>> >>> but trying to switch the vertex class doesn't seem to work. I'm
>> >>> running the latest version of Giraph (rev 1388628). Should this work
>> >>> or should I try something different?
>> >>>
>> >>> Thanks!
>> >>> Vernon
>> >>
>> >>
>> >
>>
>
>
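
The type mismatch Avery points out earlier in the thread can be illustrated
with a small standalone check. This is only a sketch and not Giraph's own
validation logic: Vertex, InputFormat, IntW, IntListW, NullW, and the
concrete classes below are hypothetical stand-ins for the Giraph and Hadoop
types, and the assumption (judging by its name) is that
IntIntNullIntTextInputFormat produces IntWritable vertex values, whereas
SimpleTriangleClosingVertex declares IntArrayListWritable. Because the job
is configured with Class objects, the compiler cannot catch this, so the
sketch compares the generic type arguments reflectively.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.Arrays;

/** Hypothetical stand-ins for illustration; none of these are Giraph classes. */
public class TypeCheckSketch {
  // <I, V, E, M> = vertex id, vertex value, edge value, message value.
  abstract static class Vertex<I, V, E, M> {}
  abstract static class InputFormat<I, V, E, M> {}

  // Stand-ins for IntWritable, IntArrayListWritable, NullWritable.
  static class IntW {}
  static class IntListW {}
  static class NullW {}

  // Mirrors the declaration quoted in the thread.
  static class TriangleVertex extends Vertex<IntW, IntListW, NullW, IntW> {}
  // Assumed shape of an "IntIntNullInt" input format: the value type is IntW.
  static class IntIntNullIntInput extends InputFormat<IntW, IntW, NullW, IntW> {}
  // A custom format whose four types match the vertex.
  static class TriangleInput extends InputFormat<IntW, IntListW, NullW, IntW> {}

  /** Reads the type arguments a class passes to its generic superclass. */
  static Type[] typeArgs(Class<?> c) {
    return ((ParameterizedType) c.getGenericSuperclass()).getActualTypeArguments();
  }

  /** True when the vertex and the format agree on all four type arguments. */
  static boolean compatible(Class<?> vertex, Class<?> format) {
    return Arrays.equals(typeArgs(vertex), typeArgs(format));
  }

  public static void main(String[] args) {
    System.out.println("IntIntNullIntInput compatible: "
        + compatible(TriangleVertex.class, IntIntNullIntInput.class));
    System.out.println("TriangleInput compatible: "
        + compatible(TriangleVertex.class, TriangleInput.class));
  }
}
```

A check like this, run before job submission, would surface the mismatch as a
clear message rather than as an IPC deserialization failure at load time.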
