giraph-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alessandro Presta" <alessan...@fb.com>
Subject Re: Review Request: In memory testing framework added
Date Wed, 20 Mar 2013 22:56:01 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10052/#review18177
-----------------------------------------------------------


Thanks for taking the time to work on this!
Please see the inline comments.
My main concern is that I don't see how we get edge values. It seems to me that there should
be no need for InMemoryEdgeInput, and instead we should store edge values in TestGraph (probably
adding an addEdge() method too).
Also, I think this functionality could be achieved more simply by storing the Vertex objects
in TestGraph (probably in a Map<I, Vertex>), and having InMemoryVertexInput simply return
references to those vertices. There will be no need for InMemoryVertexOutput, as the vertices
will be modified in-place and InternalVertexRunner#run() will just return the same TestGraph.


giraph-core/src/main/java/org/apache/giraph/utils/InMemoryEdgeInput.java
<https://reviews.apache.org/r/10052/#comment38346>

    Usually we put this all in one line.



giraph-core/src/main/java/org/apache/giraph/utils/InMemoryEdgeInput.java
<https://reviews.apache.org/r/10052/#comment38349>

    Why is this class needed anyway? Isn't a VertexInputFormat sufficient once we have built
our TestGraph? You just need to handle edge values in there too.



giraph-core/src/main/java/org/apache/giraph/utils/InMemoryEdgeInput.java
<https://reviews.apache.org/r/10052/#comment38347>

    Why do you need your own DummyInputSplit? Can't you use BspInputSplit like GeneratedVertexInputFormat
does?



giraph-core/src/main/java/org/apache/giraph/utils/InMemoryEdgeInput.java
<https://reviews.apache.org/r/10052/#comment38359>

    Why is this in terms of WritableComparable and Writable instead of I and E?



giraph-core/src/main/java/org/apache/giraph/utils/InMemoryEdgeInput.java
<https://reviews.apache.org/r/10052/#comment38348>

    What's going on with this iterator? You get it from an empty list. Later you have a commented
assignment.



giraph-core/src/main/java/org/apache/giraph/utils/InMemoryVertexInput.java
<https://reviews.apache.org/r/10052/#comment38350>

    See above.



giraph-core/src/main/java/org/apache/giraph/utils/InMemoryVertexInput.java
<https://reviews.apache.org/r/10052/#comment38351>

    See above.



giraph-core/src/main/java/org/apache/giraph/utils/InMemoryVertexInput.java
<https://reviews.apache.org/r/10052/#comment38352>

    I would call this "getEdges", maybe "createEdges". Also, looks like it could be converted
to a static method.



giraph-core/src/main/java/org/apache/giraph/utils/InMemoryVertexInput.java
<https://reviews.apache.org/r/10052/#comment38353>

    Why are we assuming there are no edge values?



giraph-core/src/main/java/org/apache/giraph/utils/InMemoryVertexInput.java
<https://reviews.apache.org/r/10052/#comment38354>

    Leftover.



giraph-core/src/main/java/org/apache/giraph/utils/InMemoryVertexInput.java
<https://reviews.apache.org/r/10052/#comment38355>

    Assigning the format's conf from the reader doesn't feel right.
    You can instead make the format implement ImmutableClassesGiraphConfigurable, so that
Giraph will provide it with the conf upon instantiation.



giraph-core/src/main/java/org/apache/giraph/utils/InMemoryVertexOutput.java
<https://reviews.apache.org/r/10052/#comment38356>

    See above.



giraph-core/src/main/java/org/apache/giraph/utils/InMemoryVertexOutput.java
<https://reviews.apache.org/r/10052/#comment38360>

    I think this would be easier to use if it returned a Map<I, Vertex> instead. That
way we can easily look up just the vertices we're interested in, as well as iterate if we
want to check a condition against all vertices.
    
    Even better would be to have TestGraph itself implement a sort of map-like interface providing
lookup and iteration over values, and skip the output format entirely.



giraph-core/src/main/java/org/apache/giraph/utils/InMemoryVertexOutput.java
<https://reviews.apache.org/r/10052/#comment38357>

    There's a class called ImmutableOutputCommitter (not sure why it's called like that) that's
the same as this anonymous class.



giraph-core/src/main/java/org/apache/giraph/utils/InternalVertexRunner.java
<https://reviews.apache.org/r/10052/#comment38358>

    This comment is not completely relevant, since there is no input file or output folder,
right?



giraph-core/src/main/java/org/apache/giraph/utils/TestGraph.java
<https://reviews.apache.org/r/10052/#comment38366>

    This is really a container of vertex values, not vertices.



giraph-core/src/main/java/org/apache/giraph/utils/TestGraph.java
<https://reviews.apache.org/r/10052/#comment38364>

    I think we should be able to skip this intermediate representation, and directly store
the vertices in a Map<I, Vertex>.
    Then, InMemoryVertexInput can simply return references to the already-created vertices
in TestGraph.
    No output format will be needed then, as the vertices will be modified in place.
    That way InternalVertexRunner#run() can simply return the same TestGraph reference, and
a user can do any checks directly against that.



giraph-core/src/main/java/org/apache/giraph/utils/TestGraph.java
<https://reviews.apache.org/r/10052/#comment38367>

    This is really a container of neighbor ids, not edges.



giraph-core/src/main/java/org/apache/giraph/utils/TestGraph.java
<https://reviews.apache.org/r/10052/#comment38370>

    Can you switch to more descriptive parameter names, as in the rest of the codebase?
    In this case, since it's clear that we're talking about a vertex, I'd be happy with "id"
and "value" instead of "i" and "v".



giraph-core/src/main/java/org/apache/giraph/utils/TestGraph.java
<https://reviews.apache.org/r/10052/#comment38369>

    It would be useful to also have an addEdge(I sourceVertexId, targetVertexId) method.
    If the vertex exists, it should add the edge. If the vertex doesn't exist, it should create
it with a default value and then add the edge.



giraph-core/src/main/java/org/apache/giraph/utils/TestGraph.java
<https://reviews.apache.org/r/10052/#comment38365>

    This is a bit misleading because it doesn't return an iterator over the edges, but over
the neighbor ids.
    Also, unless you need it to be an ArrayList in the caller, you can return a more generic
Iterable<I>.



giraph-examples/src/test/java/org/apache/giraph/examples/ConnectedComponentsVertexTestInMemory.java
<https://reviews.apache.org/r/10052/#comment38361>

    Trailing space.



giraph-examples/src/test/java/org/apache/giraph/examples/ConnectedComponentsVertexTestInMemory.java
<https://reviews.apache.org/r/10052/#comment38368>

    Can't you rewrite this more compactly as a chain of calls?
    
    graph.addVertex(...)
      .addVertex(...)
      .addVertex(...)
      ...



giraph-examples/src/test/java/org/apache/giraph/examples/ConnectedComponentsVertexTestInMemory.java
<https://reviews.apache.org/r/10052/#comment38362>

    Trailing spaces.


- Alessandro Presta


On March 20, 2013, 10:05 p.m., Veselin Stoyanov wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/10052/
> -----------------------------------------------------------
> 
> (Updated March 20, 2013, 10:05 p.m.)
> 
> 
> Review request for giraph.
> 
> 
> Description
> -------
> 
> A framework for running small test graphs in memory.
> 
> 
> Diffs
> -----
> 
>   giraph-core/src/main/java/org/apache/giraph/utils/InMemoryEdgeInput.java PRE-CREATION

>   giraph-core/src/main/java/org/apache/giraph/utils/InMemoryVertexInput.java PRE-CREATION

>   giraph-core/src/main/java/org/apache/giraph/utils/InMemoryVertexOutput.java PRE-CREATION

>   giraph-core/src/main/java/org/apache/giraph/utils/InternalVertexRunner.java 7e0b955998a5f1890912819fa2ca74fbbf46fedd

>   giraph-core/src/main/java/org/apache/giraph/utils/TestGraph.java PRE-CREATION 
>   giraph-examples/src/test/java/org/apache/giraph/examples/ConnectedComponentsVertexTestInMemory.java
PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/10052/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Veselin Stoyanov
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message