giraph-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Avery Ching" <avery.ch...@gmail.com>
Subject Re: Review Request: Input/output formats and readers/writers should implement ImmutableClassesGiraphConfigurable
Date Fri, 05 Apr 2013 19:21:04 GMT


> On April 5, 2013, 6:17 p.m., Alessandro Presta wrote:
> > My idea was actually to make the base classes (VertexReader etc) default-configurable,
exactly to avoid the code duplication seen here. They will have to be turned from interfaces
to abstract base classes (hopefully we don't have any code that relies on them being interfaces).
What do you think?
> > Also, are you sure the input formats actually need to be configurable? Only the
readers/writers need to access special Giraph methods to instantiate vertices etc. The input
formats generally just need to read some options, and that can be done from the context passed
in initialize().
> > Same for AggregatorWriter, it doesn't seem to need any of the special methods.
> 
> Avery Ching wrote:
>     Changing the interfaces to abstract classes that are default configurable for readers/writers
is reasonable.  
>     
>     I think for flexibility it would be good to make the input formats configurable (i.e.
getSplits could depend on some configuration).
>     
>     I prefer us to use ImmutableClassesGiraphConfiguration as opposed to plain Configuration
just to make it easier for the user to get Giraph-specific configuration and it doesn't hurt
anything.  
>     
>     How does that sound?
>     
>     I'll start making the changes for the interfaces -> abstract classes immediately
though.
> 
> Alessandro Presta wrote:
>     What's an example of Giraph-specific configuration needed in an input format?
>     If there isn't one, I'm still of the opinion that there's no point in changing (isn't
the whole point of ICGC that it makes instantiating certain user-defined classes faster?).
>     Anyway my major point is reducing code duplication by making that interface ->
abstract class change, assuming it is doable.
> 
> Avery Ching wrote:
>     ICGC doesn't only provide classes, it also provides access to the huge list of options
in GiraphConfiguration.
> 
> Alessandro Presta wrote:
>     Then we should provide GiraphConfiguration instead (or better yet, merge the two,
which would be less confusing).
>     Anyway I think Giraph-specific options are meant to be used by Giraph internals,
not by user-defined classes. Maybe I'm wrong on this, but then please provide a concrete example.

ImmutableClassesGiraphConfiguration is just an optimized GiraphConfiguration for classes.
 Let's think about merging/splitting functionality in another JIRA.  

I can't find any examples we have in the code.  I can't think of any particular cases, but
for instance, you wanted to know where the zk instance was running (say we wanted to get/store
some extra information there).

If we are going to provide a getConf() to the user, I don't see why it would be advantageous
to only Configuration vs ICGC in general.


- Avery


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10297/#review18711
-----------------------------------------------------------


On April 5, 2013, 7:24 a.m., Avery Ching wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/10297/
> -----------------------------------------------------------
> 
> (Updated April 5, 2013, 7:24 a.m.)
> 
> 
> Review request for giraph.
> 
> 
> Description
> -------
> 
> All input/output formats and readers/writers now are ImmutableClassesGiraphConfigurable,
which means that we no longer need to create your own ImmutableClassesGiraphConfiguration
per format/reader/writer, which is a bad way of doing things.  This simplifies the code and
makes it a little easier for users to write their own formats/readers/writers.
> 
> 
> This addresses bug GIRAPH-564.
>     https://issues.apache.org/jira/browse/GIRAPH-564
> 
> 
> Diffs
> -----
> 
>   giraph-accumulo/src/main/java/org/apache/giraph/io/accumulo/AccumuloVertexInputFormat.java
28dc951a973630bf259f4a937d9c15d2cac2597a 
>   giraph-accumulo/src/main/java/org/apache/giraph/io/accumulo/AccumuloVertexOutputFormat.java
182afad8bf7c7dab4b1bf698b5965b1615e44bcb 
>   giraph-accumulo/src/test/java/org/apache/giraph/io/accumulo/edgemarker/AccumuloEdgeInputFormat.java
4e76b747112900e95c716a1484df065b2257e7bd 
>   giraph-accumulo/src/test/java/org/apache/giraph/io/accumulo/edgemarker/AccumuloEdgeOutputFormat.java
0937b8427dabe089ac800ce8977015fed69fc987 
>   giraph-core/src/main/java/org/apache/giraph/aggregators/AggregatorWriter.java c1c66787ff7991737f4cf9b5f08e23ae8010ba85

>   giraph-core/src/main/java/org/apache/giraph/aggregators/TextAggregatorWriter.java ef4371404f264e7d11ea7e1041c8ed55d550b42c

>   giraph-core/src/main/java/org/apache/giraph/io/BasicVertexValueReader.java 1aed963f56ea00d4070f878f346181956c66fef6

>   giraph-core/src/main/java/org/apache/giraph/io/EdgeInputFormat.java 87ea5a0eeee5dd994857e8b181afeb8e9b066417

>   giraph-core/src/main/java/org/apache/giraph/io/EdgeReader.java f6dccdc5f63746fbe6f7d96706f8e311edc74808

>   giraph-core/src/main/java/org/apache/giraph/io/GiraphInputFormat.java 6b175a285553b1451320f8f89fe78e48e167988b

>   giraph-core/src/main/java/org/apache/giraph/io/ReverseEdgeDuplicator.java d6cb88612a3fde4043839da09f0b7d27809324bd

>   giraph-core/src/main/java/org/apache/giraph/io/VertexInputFormat.java 5aface5dce9f4c7d6979acee58a9585a89656bba

>   giraph-core/src/main/java/org/apache/giraph/io/VertexOutputFormat.java 270d0402373c8385db2d634d090276805689cfa3

>   giraph-core/src/main/java/org/apache/giraph/io/VertexReader.java ef4ded6a64f3e3337b9b50da325a14fd36f2b4c3

>   giraph-core/src/main/java/org/apache/giraph/io/VertexValueReader.java 25e609303234b363d08de0c4e62b5d4e620fab14

>   giraph-core/src/main/java/org/apache/giraph/io/VertexWriter.java 2aa3a71e416c27fb8bec1536cd303235526c86d8

>   giraph-core/src/main/java/org/apache/giraph/io/formats/AdjacencyListTextVertexInputFormat.java
7af4a71b34777d87bf5efd3bc0f1fa0278450fb4 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/AdjacencyListTextVertexOutputFormat.java
581540315c69ee93dc212d378004739b3269045a 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/IdWithValueTextOutputFormat.java
fe4a1d59c56da48bce8c95e1f693e1994a549a2e 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/IntNullReverseTextEdgeInputFormat.java
02703480d93ce7dc2dd3c35e1ff9140ba43da9cf 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/PseudoRandomEdgeInputFormat.java
cd454e3c36235ac07b43e9700e511915bd8e4a47 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/PseudoRandomVertexInputFormat.java
c261f728b434c5cd7619e5276bda9ecd976f9de4 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/SequenceFileVertexInputFormat.java
3ffda1fd5f80421fd627b43f3b1e107334f7137d 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/SequenceFileVertexOutputFormat.java
468e6bdebaad5d0ddadcf5d8abd72135b439b188 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextEdgeInputFormat.java 0aae894a69fd18d390d8f3a4fec5615232c38ef7

>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextVertexInputFormat.java e47bda37d68481e8a0abe02f984df13758f1ca41

>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextVertexOutputFormat.java
ad96cfe07d5ace11fce46f2af46745d09d1631db 
>   giraph-core/src/main/java/org/apache/giraph/io/iterables/EdgeReaderWrapper.java 4af221a22bdf98b079c403dcb4bd8f36b0ddc0f1

>   giraph-core/src/main/java/org/apache/giraph/io/iterables/VertexReaderWrapper.java 7e695ac99873dffc9c63edb8091870f5ceaa748f

>   giraph-core/src/main/java/org/apache/giraph/io/superstep_output/MultiThreadedSuperstepOutput.java
a09f915c294c82de73234db1699cf6480cde8530 
>   giraph-core/src/main/java/org/apache/giraph/io/superstep_output/SynchronizedSuperstepOutput.java
9617b24907f7f53b7f16819ab997116369c290f3 
>   giraph-core/src/main/java/org/apache/giraph/utils/InMemoryVertexInputFormat.java 8e56eac4b49caddb635966f8204cab9a5ae88d15

>   giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java 2ea91b5d0f2d3d97d63a3ad96e04df5fa26d849b

>   giraph-core/src/main/java/org/apache/giraph/worker/EdgeInputSplitsCallable.java 017257ad770f97c8541a80fb132190c39cb08138

>   giraph-core/src/main/java/org/apache/giraph/worker/VertexInputSplitsCallable.java c18648ba0dfc19b6a33f311578de99b5252f36fd

>   giraph-core/src/test/java/org/apache/giraph/io/TestAdjacencyListTextVertexOutputFormat.java
60bfac1036abd9b8d29c965585ab3ace531dbf49 
>   giraph-core/src/test/java/org/apache/giraph/io/TestIdWithValueTextOutputFormat.java
3fb4bbec98a243d9fbf9bb6231b9c7010d63ef48 
>   giraph-core/src/test/java/org/apache/giraph/io/TestLongDoubleDoubleAdjacencyListVertexInputFormat.java
34140886afef90b21c1eb6d6236a72bed967b849 
>   giraph-core/src/test/java/org/apache/giraph/io/TestTextDoubleDoubleAdjacencyListVertexInputFormat.java
7aea9ce24a9bd3bdc3a4b34259761ccd82dfb989 
>   giraph-examples/src/main/java/org/apache/giraph/examples/GeneratedVertexReader.java
9e4cb83659ec4382a9c45179ef8e4e1c73acb49a 
>   giraph-examples/src/main/java/org/apache/giraph/examples/SimpleAggregatorWriter.java
188762105c15bfd54cfb34c40dad021ab0fbcc84 
>   giraph-examples/src/main/java/org/apache/giraph/examples/SimplePageRankVertex.java
73dc9a971be53e2695916f05e3f9d0175014fab3 
>   giraph-examples/src/main/java/org/apache/giraph/examples/SimpleSuperstepVertex.java
df0359a61c5f69f51cea28230ba89e311ec2df0b 
>   giraph-hbase/src/main/java/org/apache/giraph/io/hbase/HBaseVertexInputFormat.java 590ecb03315d7389a7cd87b34d704b8017b2b83f

>   giraph-hbase/src/main/java/org/apache/giraph/io/hbase/HBaseVertexOutputFormat.java
297139708e7006371a9e4a4d3accb8e58a14224d 
>   giraph-hbase/src/test/java/org/apache/giraph/io/hbase/TestHBaseRootMarkerVertextFormat.java
4bd5dcfba2d401c796b268485a089aaade2320ca 
>   giraph-hbase/src/test/java/org/apache/giraph/io/hbase/edgemarker/TableEdgeInputFormat.java
55e506b3794cf90748581251dd7f453baa1e853d 
>   giraph-hcatalog/src/main/java/org/apache/giraph/io/hcatalog/HCatalogEdgeInputFormat.java
96ec77f6831cda0419cc1e2a3ab82234843abefb 
>   giraph-hcatalog/src/main/java/org/apache/giraph/io/hcatalog/HCatalogVertexInputFormat.java
9dea18eaa3210f6a24bef5592acfcff289e975d8 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/edge/HiveEdgeInputFormat.java
041e331c79f19bba5c44234256a4192edd791353 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/edge/HiveEdgeReader.java 9093603dba77c567300e82f847b762f573256fdf

>   giraph-hive/src/main/java/org/apache/giraph/hive/input/vertex/HiveVertexInputFormat.java
1f87257cba0fc0c5dd80f3f204da5eff8fb948f2 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/vertex/HiveVertexReader.java
318e83fd19d8c9fc79234e948938f0ce58966e2e 
>   giraph-hive/src/main/java/org/apache/giraph/hive/output/HiveVertexOutputFormat.java
9ff74d633151b1a53a7cd16635032351e16f477a 
>   giraph-hive/src/main/java/org/apache/giraph/hive/output/HiveVertexWriter.java 45d0226ddf8828c9587e6904009ffee6c045ad13

> 
> Diff: https://reviews.apache.org/r/10297/diff/
> 
> 
> Testing
> -------
> 
> mvn clean verify
> Ran pagerank benchmark on a real cluster.
> 
> 
> Thanks,
> 
> Avery Ching
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message