giraph-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Maja Kabiljo" <majakabi...@fb.com>
Subject Re: Review Request: GIRAPH-588: More flexible Hive input
Date Wed, 27 Mar 2013 16:20:18 GMT


> On March 27, 2013, 3:18 a.m., Nitay Joffe wrote:
> > I love where this is going, awesome work! The only thing that bugs me a bit is all
the wrappers - it makes the code a bit hard to understand so I worry about new users. Could
we get rid of some of the wrappers if we carry the Iterator<> interface up through to
Edge/VertexReader themselves.

I'm not sure what do you mean. As we discussed offline, eventually we can have all readers
implementing Iterator, but that's a big change so we can make it in several parts.
Also, wrappers are not visible to the user, he should be working only with HiveToEdge/SimpleHiveToEdge
etc.


> On March 27, 2013, 3:18 a.m., Nitay Joffe wrote:
> > giraph-hive/src/main/java/org/apache/giraph/hive/input/RecordReaderWrapper.java,
line 34
> > <https://reviews.apache.org/r/10126/diff/1/?file=274498#file274498line34>
> >
> >     Look into Guava's AbstractIterator, it might make this easier/cleaner to implement?

Awesome, didn't know about that class. Much cleaner :-)


> On March 27, 2013, 3:18 a.m., Nitay Joffe wrote:
> > giraph-hive/src/main/java/org/apache/giraph/hive/input/edge/AbstractHiveToEdge.java,
line 37
> > <https://reviews.apache.org/r/10126/diff/1/?file=274499#file274499line37>
> >
> >     Maybe add default implementation of remove() that throws UnsupportedOperationException
so others don't have to have the method unless they really want to. Could also make it final
if we don't want to allow removal ever?

Good idea


> On March 27, 2013, 3:18 a.m., Nitay Joffe wrote:
> > giraph-hive/src/main/java/org/apache/giraph/hive/input/vertex/HiveToVertex.java,
line 41
> > <https://reviews.apache.org/r/10126/diff/1/?file=274507#file274507line41>
> >
> >     I take it you removed HiveToVertexValue because you're going to essentially
reimplement that with a HiveToEdge that reads single row => multiple edges? Cool!

You mean HiveToVertexEdges? No, that would be more expensive. From what I understood in the
current implementation, HiveToVertexEdges is just an extension to HiveToVertexValue. Now HiveTovertex
can read both values and edges, and SimpleNoEdgesHiveToVertex is for the cases when we have
no edges.


- Maja


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10126/#review18414
-----------------------------------------------------------


On March 27, 2013, 4:19 p.m., Maja Kabiljo wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/10126/
> -----------------------------------------------------------
> 
> (Updated March 27, 2013, 4:19 p.m.)
> 
> 
> Review request for giraph and Nitay Joffe.
> 
> 
> Description
> -------
> 
> Currently with Hive input formats it's only possible to convert single row to single
vertex/edge. We should support some rows to be skipped. General multiple rows to multiple
vertices/edges conversion can be useful (one example - we have edge input, sorted by source
id, and we want to keep only top k weighted edges per vertex).
> 
> Added some wrappers for readers to convert them to Iterator-like classes.
> 
> 
> This addresses bug GIRAPH-588.
>     https://issues.apache.org/jira/browse/GIRAPH-588
> 
> 
> Diffs
> -----
> 
>   giraph-core/src/main/java/org/apache/giraph/io/iterables/EdgeReaderWrapper.java PRE-CREATION

>   giraph-core/src/main/java/org/apache/giraph/io/iterables/EdgeWithSource.java PRE-CREATION

>   giraph-core/src/main/java/org/apache/giraph/io/iterables/GiraphReader.java PRE-CREATION

>   giraph-core/src/main/java/org/apache/giraph/io/iterables/IteratorToReaderWrapper.java
PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/iterables/VertexReaderWrapper.java PRE-CREATION

>   giraph-core/src/main/java/org/apache/giraph/io/iterables/package-info.java PRE-CREATION

>   giraph-hive/src/main/java/org/apache/giraph/hive/HiveGiraphRunner.java efc08d3 
>   giraph-hive/src/main/java/org/apache/giraph/hive/common/DefaultConfigurableAndTableSchemaAware.java
PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/RecordReaderWrapper.java PRE-CREATION

>   giraph-hive/src/main/java/org/apache/giraph/hive/input/edge/AbstractHiveToEdge.java
f29fea7 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/edge/HiveEdgeInputFormat.java
18b40c2 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/edge/HiveEdgeReader.java 6fb183a

>   giraph-hive/src/main/java/org/apache/giraph/hive/input/edge/HiveToEdge.java 2205b82

>   giraph-hive/src/main/java/org/apache/giraph/hive/input/edge/SimpleHiveToEdge.java PRE-CREATION

>   giraph-hive/src/main/java/org/apache/giraph/hive/input/vertex/AbstractHiveToVertex.java
PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/vertex/AbstractHiveToVertexEdges.java
d0668f6 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/vertex/AbstractHiveToVertexValue.java
9ab316f 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/vertex/HiveToVertex.java PRE-CREATION

>   giraph-hive/src/main/java/org/apache/giraph/hive/input/vertex/HiveToVertexEdges.java
8076a8a 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/vertex/HiveToVertexValue.java
593eb9a 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/vertex/HiveVertexInputFormat.java
25c7a26 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/vertex/HiveVertexReader.java
541176f 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/vertex/SimpleHiveToVertex.java
PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/vertex/SimpleNoEdgesHiveToVertex.java
PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/output/AbstractVertexToHive.java 8e3f1ca

> 
> Diff: https://reviews.apache.org/r/10126/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Maja Kabiljo
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message