giraph-user mailing list archives

From Vernon Thommeret <syno...@gmail.com>
Subject Re: Getting SimpleTriangleClosingVertex to run
Date Sun, 21 Oct 2012 19:57:48 GMT
Hey Eli,

Thanks for the suggestions. I've been playing with this nights and
weekends, which is why there's been such a delay :). I should have
more time in a couple weeks and will dig back in and report back.

Vernon

On Mon, Oct 8, 2012 at 9:00 PM, Eli Reisman <apache.mailbox@gmail.com> wrote:
> Brief follow-up:
>
> GIRAPH-314, which is not rebased or committed yet, is another part of this
> puzzle where I attempt to combine the messages and allow primitive (hacky)
> ability to amortize the supersteps where vertices message each other to keep
> the volume of messages down per-superstep. It's a blatant trade of time for
> space, and probably a desperate cry for help too. I will update it ASAP so
> you can play with it. I had pretty promising results but that was when I had
> a cluster to play with ;)
>
> First step, I'd try Maja's recipe for spill-to-disk during messaging. Her
> advice is in those 314-322-328 threads.
>
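To make the "amortize the supersteps" idea concrete, here is a minimal sketch of one way to stagger senders so that only a slice of the vertices broadcasts in any given superstep. This is plain Java, not Giraph code and not what GIRAPH-314 actually does; the class name StaggeredSend, the schedule method, and the id-modulo batching rule are all assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

public class StaggeredSend {
    /**
     * Hypothetical helper: splits vertex ids into `rounds` batches.
     * Batch r holds the ids that would broadcast in superstep r, so each
     * superstep carries roughly 1/rounds of the senders (more supersteps,
     * fewer messages in flight at once).
     */
    public static List<List<Integer>> schedule(Collection<Integer> vertexIds, int rounds) {
        if (rounds <= 0) {
            throw new IllegalArgumentException("rounds must be positive");
        }
        List<List<Integer>> plan = new ArrayList<>();
        for (int r = 0; r < rounds; r++) {
            plan.add(new ArrayList<>());
        }
        for (int id : vertexIds) {
            // floorMod keeps the batch index non-negative for negative ids.
            plan.get(Math.floorMod(id, rounds)).add(id);
        }
        return plan;
    }

    public static void main(String[] args) {
        System.out.println(schedule(List.of(0, 1, 2, 3, 4), 2));
    }
}
```

With two rounds, the even ids send in one superstep and the odd ids in the next, which is the time-for-space trade described above.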
>
> On Mon, Oct 8, 2012 at 3:55 PM, Eli Reisman <apache.mailbox@gmail.com>
> wrote:
>>
>> I have had some trouble scaling it too; that is an issue I've been working
>> at from several angles for a few months now. The main problem is the
>> explosion of messaging that occurs.
>>
>> It might be worth trying the spill-to-disk features. There was a thread in
>> the JIRA (I think for GIRAPH-328 or 322, maybe a bit earlier, I can
>> check...) where Maja explained that the spill also halts computation when
>> messages build up so that we never quite overrun our memory reserves during
>> the computation/message stages. This trades time for space, but is something
>> I have been meaning to experiment with, as in many situations it's a trade
>> well worth making. I will be experimenting with this option myself soon; it's
>> on my "short list" of Giraph stuff-to-do!
>>
>> I am also independently working on some ways to deduplicate broadcast
>> messages such as those used in triangle closing so that in-memory runs of
>> this algorithm are possible at interesting scales. That idea has undergone
>> some "evolution" and is still underway (it's the aforementioned GIRAPH-322),
>> so more to follow there when my schoolwork lets up... ;)
>>
>> Eli
>>
>>
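For a feel of why the messaging explodes, here is a rough back-of-the-envelope sketch. It assumes each vertex broadcasts every one of its neighbor ids to every neighbor, which is my reading of the triangle closing example rather than something stated in this thread. The class BroadcastCost and both method names are hypothetical, and the "deduplicated" count only illustrates the general idea of keeping one copy of each broadcast payload; it is not the GIRAPH-322 design.

```java
import java.util.List;
import java.util.Map;

public class BroadcastCost {
    /** Ids sent when every vertex sends each neighbor id to each neighbor: sum of deg^2. */
    public static long naiveMessages(Map<Integer, List<Integer>> adjacency) {
        long total = 0;
        for (List<Integer> neighbors : adjacency.values()) {
            total += (long) neighbors.size() * neighbors.size();
        }
        return total;
    }

    /** Ids held if each sender's neighbor list is stored once and shared: sum of deg. */
    public static long dedupedPayloadIds(Map<Integer, List<Integer>> adjacency) {
        long total = 0;
        for (List<Integer> neighbors : adjacency.values()) {
            total += neighbors.size();
        }
        return total;
    }

    public static void main(String[] args) {
        // Star graph: vertex 0 points at 1, 2, 3 and each leaf points back at 0.
        Map<Integer, List<Integer>> star = Map.of(
            0, List.of(1, 2, 3),
            1, List.of(0),
            2, List.of(0),
            3, List.of(0));
        System.out.println("naive ids sent:    " + naiveMessages(star));
        System.out.println("deduped ids held:  " + dedupedPayloadIds(star));
    }
}
```

The gap between the two numbers grows with the square of the degree, so a handful of high-degree vertices can dominate memory use.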
>>
>> On Sun, Oct 7, 2012 at 12:11 PM, Vernon Thommeret <synotic@gmail.com>
>> wrote:
>>>
>>> Thanks. I ended up getting it working. Having some issues scaling it,
>>> but working on it.
>>>
>>> On Mon, Sep 24, 2012 at 1:17 PM, Eli Reisman <apache.mailbox@gmail.com>
>>> wrote:
>>> > The I/O format types have to be compatible. Since
>>> > IdWithValueVertexOutputFormat does not specify the types it takes, it
>>> > just attempts to output them using the Writable interface. I use it to
>>> > output data from the SimpleTriangleClosingVertex. I also had to write an
>>> > InputFormat to accept IntWritable ids and IntWritable out-edge
>>> > destinations. Otherwise, it should work.
>>> >
>>> >
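As an illustration of the parsing such a custom input format would need, here is a minimal sketch that turns one text line into an int vertex id plus int out-edge destinations. It is plain Java with no Giraph or Hadoop types; the line layout (id followed by whitespace-separated neighbor ids) and the names AdjacencyLineParser and ParsedVertex are assumptions, not the format Eli actually wrote.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class AdjacencyLineParser {
    /** Hypothetical holder for one parsed line: a vertex id and its out-edge targets. */
    public static final class ParsedVertex {
        public final int id;
        public final List<Integer> targets;

        ParsedVertex(int id, List<Integer> targets) {
            this.id = id;
            this.targets = Collections.unmodifiableList(targets);
        }
    }

    /** Parses "id n1 n2 ..." (assumed layout); a line with only an id has no out-edges. */
    public static ParsedVertex parse(String line) {
        String[] tokens = line.trim().split("\\s+");
        if (tokens[0].isEmpty()) {
            throw new IllegalArgumentException("empty line");
        }
        int id = Integer.parseInt(tokens[0]);
        List<Integer> targets = new ArrayList<>();
        for (int i = 1; i < tokens.length; i++) {
            targets.add(Integer.parseInt(tokens[i]));
        }
        return new ParsedVertex(id, targets);
    }

    public static void main(String[] args) {
        ParsedVertex v = parse("7 1 2 3");
        System.out.println(v.id + " -> " + v.targets);
    }
}
```

In a real input format the parsed ints would be wrapped in IntWritable and handed to the vertex reader; that wiring depends on the Giraph revision and is left out here.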
>>> >
>>> > On Mon, Sep 24, 2012 at 12:06 AM, Avery Ching <aching@apache.org>
>>> > wrote:
>>> >>
>>> >> I don't think the types are compatible.
>>> >>
>>> >> public class SimpleTriangleClosingVertex extends EdgeListVertex<
>>> >>   IntWritable, SimpleTriangleClosingVertex.IntArrayListWritable,
>>> >>   NullWritable, IntWritable>
>>> >>
>>> >> You'll need to use an input format and output format that fit these
>>> >> types. Otherwise the issue is likely to be serialization/deserialization
>>> >> here.
>>> >>
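The mismatch can be shown with a small self-contained sketch that compares the four type parameters (vertex id, vertex value, edge value, message) declared by a vertex class and by an input format. Everything here is a stand-in: IntW, IntListW, NullW, VertexBase and InputFormatBase are hypothetical placeholders, not the Hadoop or Giraph classes, and the assumption that IntIntNullIntTextInputFormat supplies an IntWritable vertex value is inferred from its name only.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.Arrays;

public class TypeCheckSketch {
    // Placeholder value types (stand-ins, not the real Writable classes).
    public static class IntW {}
    public static class IntListW {}
    public static class NullW {}

    // Hypothetical bases carrying <id, vertex value, edge value, message>.
    public abstract static class VertexBase<I, V, E, M> {}
    public abstract static class InputFormatBase<I, V, E, M> {}

    // Mirrors the signature quoted above: the vertex value is a list type.
    public static class TriangleClosingLike extends VertexBase<IntW, IntListW, NullW, IntW> {}
    // Assumed shape of an int/int/null/int input format: the vertex value is an int.
    public static class IntIntNullIntLike extends InputFormatBase<IntW, IntW, NullW, IntW> {}
    // An input format whose parameters line up with the vertex.
    public static class MatchingInput extends InputFormatBase<IntW, IntListW, NullW, IntW> {}

    /** Reads the type arguments a class passes to its generic superclass. */
    public static Type[] typeArgs(Class<?> c) {
        return ((ParameterizedType) c.getGenericSuperclass()).getActualTypeArguments();
    }

    /** True when both classes declare the same four type arguments, in order. */
    public static boolean compatible(Class<?> vertex, Class<?> format) {
        return Arrays.equals(typeArgs(vertex), typeArgs(format));
    }

    public static void main(String[] args) {
        System.out.println(compatible(TriangleClosingLike.class, IntIntNullIntLike.class));
        System.out.println(compatible(TriangleClosingLike.class, MatchingInput.class));
    }
}
```

The first pair differs in the vertex value slot, which is the kind of disagreement that would only surface at run time as a serialization failure.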
>>> >>
>>> >> On 9/23/12 10:44 PM, Vernon Thommeret wrote:
>>> >>>
>>> >>> I'm trying to get the SimpleTriangleClosingVertex to run, but getting
>>> >>> this error:
>>> >>>
>>> >>> java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException: IPC server unable to read call parameters: null
>>> >>>         at org.apache.giraph.comm.BasicRPCCommunications.sendPartitionRequest(BasicRPCCommunications.java:923)
>>> >>>         at org.apache.giraph.graph.BspServiceWorker.loadVertices(BspServiceWorker.java:327)
>>> >>>         at org.apache.giraph.graph.BspServiceWorker.setup(BspServiceWorker.java:604)
>>> >>>         at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:377)
>>> >>>         at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:578)
>>> >>>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
>>> >>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
>>> >>>         at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>>> >>>         at java.security.AccessController.doPrivileged(Native Method)
>>> >>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>> >>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
>>> >>>         at org.apache.hadoop.mapred.Child.main(Child.java:264)
>>> >>> Caused by: org.apache.hadoop.ipc.RemoteException: IPC server
>>> >>>
>>> >>> This is the diff that causes the issue:
>>> >>>
>>> >>> @@ -33,7 +33,7 @@ import org.apache.hadoop.fs.Path;
>>> >>>   import org.apache.hadoop.io.IntWritable;
>>> >>>
>>> >>>   import org.apache.giraph.graph.GiraphJob;
>>> >>> -import org.apache.giraph.graph.IntIntNullIntVertex;
>>> >>> +import org.apache.giraph.examples.SimpleTriangleClosingVertex;
>>> >>>   import org.apache.giraph.io.IntIntNullIntTextInputFormat;
>>> >>>   import org.apache.giraph.io.AdjacencyListTextVertexOutputFormat;
>>> >>>
>>> >>> @@ -44,16 +44,12 @@ import org.apache.log4j.Logger;
>>> >>>   /**
>>> >>>    * Simple function to return the in degree for each vertex.
>>> >>>    */
>>> >>> -public class SharedConnectionsVertex extends IntIntNullIntVertex implements Tool {
>>> >>> +public class SharedConnections implements Tool {
>>> >>>
>>> >>>     private Configuration conf;
>>> >>>     private static final Logger LOG =
>>> >>>         Logger.getLogger(SharedConnections.class);
>>> >>>
>>> >>> -  public void compute(Iterable<IntWritable> messages) {
>>> >>> -    voteToHalt();
>>> >>> -  }
>>> >>> -
>>> >>>     @Override
>>> >>>     public final int run(final String[] args) throws Exception {
>>> >>>       Options options = new Options();
>>> >>> @@ -71,7 +67,7 @@ public class SharedConnections extends IntIntNullIntVertex implements Tool {
>>> >>>
>>> >>>       GiraphJob job = new GiraphJob(getConf(), getClass().getName());
>>> >>>
>>> >>> -    job.setVertexClass(SharedConnections.class);
>>> >>> +    job.setVertexClass(SimpleTriangleClosingVertex.class);
>>> >>>
>>> >>>       job.setVertexInputFormatClass(IntIntNullIntTextInputFormat.class);
>>> >>>       job.setVertexOutputFormatClass(AdjacencyListTextVertexOutputFormat.class);
>>> >>>       job.setWorkerConfiguration(10, 10, 100.0f);
>>> >>>
>>> >>> --
>>> >>>
>>> >>> I.e. I have a dummy job that just outputs the vertices, which works,
>>> >>> but switching the vertex class doesn't seem to work. I'm running the
>>> >>> latest version of Giraph (rev 1388628). Should this work, or should I
>>> >>> try something different?
>>> >>>
>>> >>> Thanks!
>>> >>> Vernon
>>> >>
>>> >>
>>> >
>>
>>
>
