giraph-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alessandro Presta" <alessan...@fb.com>
Subject Re: Review Request: Allow creation of graph by adding edges that span multiple workers
Date Fri, 02 Nov 2012 04:06:45 GMT


> On Oct. 31, 2012, 8:43 p.m., Nitay Joffe wrote:
> > This is really cool stuff Alessandro. Logic and design wise it looks sweet to me.
> > The main comments I have are around too much copy/pasting. There are many places
where I think the copy/paste can be cleaned up fairly easily. See detailed comments for examples.
> >

About to update the diff with all the suggested cleanup to minimize code duplication. Had
to introduce a factory :/


> On Oct. 31, 2012, 8:43 p.m., Nitay Joffe wrote:
> > /trunk/giraph/src/main/java/org/apache/giraph/graph/VertexInputSplitsCallable.java,
line 1
> > <https://reviews.apache.org/r/7784/diff/1/?file=183025#file183025line1>
> >
> >     I'm assuming this is same as InputSplitsCallable as before right?

Yes


> On Oct. 31, 2012, 8:43 p.m., Nitay Joffe wrote:
> > /trunk/giraph/src/main/java/org/apache/giraph/graph/BspServiceMaster.java, line
295
> > <https://reviews.apache.org/r/7784/diff/1/?file=183014#file183014line295>
> >
> >     Let's not be lazy. MakeĀ a common function with generateVertexInputSplits()
and call it here. Copy/paste is bad form and has bitten us many times before.

Fixed by adding a common interface to VertexInputFormat and EdgeInputFormat that defines getSplits().


> On Oct. 31, 2012, 8:43 p.m., Nitay Joffe wrote:
> > /trunk/giraph/src/main/java/org/apache/giraph/graph/BspService.java, line 219
> > <https://reviews.apache.org/r/7784/diff/1/?file=183013#file183013line219>
> >
> >     I see a lot of duplication here. If these four and the previous four paths for
vertexes are going to be kept around for the long term (not just some intermediate thing while
we're developing quickly), how about we stick them in a class? Then have one instance for
vertexes and one for edges? Something like zkInputSplitPaths?

Good idea, thanks!


> On Oct. 31, 2012, 8:43 p.m., Nitay Joffe wrote:
> > /trunk/giraph/src/main/java/org/apache/giraph/graph/EdgeReader.java, line 29
> > <https://reviews.apache.org/r/7784/diff/1/?file=183019#file183019line29>
> >
> >     s/vertices/edges ?

/g


> On Oct. 31, 2012, 8:43 p.m., Nitay Joffe wrote:
> > /trunk/giraph/src/main/java/org/apache/giraph/io/IntIntTextVertexValueInputFormat.java,
line 60
> > <https://reviews.apache.org/r/7784/diff/1/?file=183032#file183032line60>
> >
> >     Guava's stuff might actually be faster / better for this? Splitter TAB_SPLITTER
= Splitter.on("\t") up top, then
> >     TAB_SPLITTER.split(line.toString()) here?

Do you have evidence that it's significantly faster in this case?
Splitter produces an Iterable<String>, which makes it a bit annoying to access the first
two elements.


- Alessandro


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7784/#review12969
-----------------------------------------------------------


On Oct. 30, 2012, 11:44 p.m., Alessandro Presta wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/7784/
> -----------------------------------------------------------
> 
> (Updated Oct. 30, 2012, 11:44 p.m.)
> 
> 
> Review request for giraph.
> 
> 
> Description
> -------
> 
> See https://issues.apache.org/jira/browse/GIRAPH-155
> 
> 
> This addresses bug GIRAPH-155.
>     https://issues.apache.org/jira/browse/GIRAPH-155
> 
> 
> Diffs
> -----
> 
>   /trunk/giraph/src/main/java/org/apache/giraph/GiraphConfiguration.java 1403451 
>   /trunk/giraph/src/main/java/org/apache/giraph/ImmutableClassesGiraphConfiguration.java
1403451 
>   /trunk/giraph/src/main/java/org/apache/giraph/bsp/CentralizedServiceMaster.java 1403451

>   /trunk/giraph/src/main/java/org/apache/giraph/comm/WorkerServer.java 1403451 
>   /trunk/giraph/src/main/java/org/apache/giraph/comm/netty/NettyWorkerClientRequestProcessor.java
1403451 
>   /trunk/giraph/src/main/java/org/apache/giraph/comm/netty/NettyWorkerServer.java 1403451

>   /trunk/giraph/src/main/java/org/apache/giraph/comm/requests/SendPartitionMutationsRequest.java
1403451 
>   /trunk/giraph/src/main/java/org/apache/giraph/graph/BasicVertexValueReader.java PRE-CREATION

>   /trunk/giraph/src/main/java/org/apache/giraph/graph/BspService.java 1403451 
>   /trunk/giraph/src/main/java/org/apache/giraph/graph/BspServiceMaster.java 1403451 
>   /trunk/giraph/src/main/java/org/apache/giraph/graph/BspServiceWorker.java 1403451 
>   /trunk/giraph/src/main/java/org/apache/giraph/graph/BspUtils.java 1403451 
>   /trunk/giraph/src/main/java/org/apache/giraph/graph/EdgeInputFormat.java PRE-CREATION

>   /trunk/giraph/src/main/java/org/apache/giraph/graph/EdgeInputSplitsCallable.java PRE-CREATION

>   /trunk/giraph/src/main/java/org/apache/giraph/graph/EdgeReader.java PRE-CREATION 
>   /trunk/giraph/src/main/java/org/apache/giraph/graph/EdgeWithSource.java PRE-CREATION

>   /trunk/giraph/src/main/java/org/apache/giraph/graph/GiraphJob.java 1403451 
>   /trunk/giraph/src/main/java/org/apache/giraph/graph/GraphMapper.java 1403451 
>   /trunk/giraph/src/main/java/org/apache/giraph/graph/InputSplitsCallable.java 1403451

>   /trunk/giraph/src/main/java/org/apache/giraph/graph/MasterThread.java 1403451 
>   /trunk/giraph/src/main/java/org/apache/giraph/graph/VertexInputSplitsCallable.java
PRE-CREATION 
>   /trunk/giraph/src/main/java/org/apache/giraph/graph/VertexReader.java 1403451 
>   /trunk/giraph/src/main/java/org/apache/giraph/graph/VertexResolver.java 1403451 
>   /trunk/giraph/src/main/java/org/apache/giraph/graph/VertexValueInputFormat.java PRE-CREATION

>   /trunk/giraph/src/main/java/org/apache/giraph/graph/VertexValueReader.java PRE-CREATION

>   /trunk/giraph/src/main/java/org/apache/giraph/io/GiraphFileInputFormat.java PRE-CREATION

>   /trunk/giraph/src/main/java/org/apache/giraph/io/GiraphTextInputFormat.java PRE-CREATION

>   /trunk/giraph/src/main/java/org/apache/giraph/io/IntIntTextVertexValueInputFormat.java
PRE-CREATION 
>   /trunk/giraph/src/main/java/org/apache/giraph/io/IntNullTextEdgeInputFormat.java PRE-CREATION

>   /trunk/giraph/src/main/java/org/apache/giraph/io/TextEdgeInputFormat.java PRE-CREATION

>   /trunk/giraph/src/main/java/org/apache/giraph/io/TextVertexInputFormat.java 1403451

>   /trunk/giraph/src/main/java/org/apache/giraph/io/TextVertexValueInputFormat.java PRE-CREATION

>   /trunk/giraph/src/main/java/org/apache/giraph/utils/FileUtils.java 1403451 
>   /trunk/giraph/src/main/java/org/apache/giraph/utils/InternalVertexRunner.java 1403451

>   /trunk/giraph/src/test/java/org/apache/giraph/BspCase.java 1403451 
>   /trunk/giraph/src/test/java/org/apache/giraph/TestEdgeInput.java PRE-CREATION 
>   /trunk/giraph/src/test/java/org/apache/giraph/TestJsonBase64Format.java 1403451 
> 
> Diff: https://reviews.apache.org/r/7784/diff/
> 
> 
> Testing
> -------
> 
> 1) mvn verify
> 2) pseudo-distributed tests
> 3) PageRank on cluster with real text input (will post perf results on the issue)
> 
> 
> Thanks,
> 
> Alessandro Presta
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message