giraph-dev mailing list archives

From "Maja Kabiljo" <majakabi...@fb.com>
Subject Review Request: GIRAPH-648: Allow IO formats to add parameters to Configuration
Date Sun, 21 Apr 2013 17:40:26 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10690/
-----------------------------------------------------------

Review request for giraph.


Description
-------

Currently we rely heavily on runners (HCatGiraphRunner and HiveGiraphRunner) to prepare the
Configuration before the application starts, and we have no way of using hcat/hive io without
these runners. It would be better and more flexible if io formats added whatever the
underlying io needs to the Configuration themselves.

Unfortunately this is not as straightforward as it sounds, because the methods of io formats,
readers, writers and OutputCommitter take a JobContext or TaskAttemptContext as an argument,
and in some cases those hold a copy of the Configuration, not the original. So I added a way
to track which parameters were added to GiraphConfiguration, and wrapped all io-related calls
to append those parameters to the JobContext/TaskAttemptContext before passing control to the
actual io formats.
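
A minimal sketch of the tracking-and-wrapping idea described above; it is not the code from
the patch. To stay self-contained it uses plain-Java stand-ins instead of the Hadoop and
Giraph classes, so every name in it (SimpleConf, TrackingConf, SimpleContext,
SimpleInputFormat, WrappedInputFormat, startTracking, copyAddedTo, describeSplits) is
hypothetical and chosen only for illustration.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a Hadoop Configuration: a plain key/value map. */
class SimpleConf {
  private final Map<String, String> values = new HashMap<>();

  SimpleConf() { }

  /** Copy constructor, mimicking a context that holds its own copy. */
  SimpleConf(SimpleConf other) {
    values.putAll(other.values);
  }

  void set(String key, String value) {
    values.put(key, value);
  }

  String get(String key) {
    return values.get(key);
  }
}

/** Configuration that remembers every parameter set after tracking starts. */
class TrackingConf extends SimpleConf {
  private final Map<String, String> added = new LinkedHashMap<>();
  private boolean tracking;

  void startTracking() {
    tracking = true;
  }

  @Override
  void set(String key, String value) {
    super.set(key, value);
    if (tracking) {
      added.put(key, value);
    }
  }

  /** Append the tracked parameters to another (possibly stale) configuration. */
  void copyAddedTo(SimpleConf target) {
    for (Map.Entry<String, String> entry : added.entrySet()) {
      target.set(entry.getKey(), entry.getValue());
    }
  }
}

/** Hypothetical stand-in for JobContext: holds a copy, not the original. */
class SimpleContext {
  private final SimpleConf conf;

  SimpleContext(SimpleConf original) {
    this.conf = new SimpleConf(original);
  }

  SimpleConf getConfiguration() {
    return conf;
  }
}

/** Hypothetical stand-in for an io format method that receives a context. */
interface SimpleInputFormat {
  String describeSplits(SimpleContext context);
}

/** Wrapper that refreshes the context before handing control to the real format. */
class WrappedInputFormat implements SimpleInputFormat {
  private final SimpleInputFormat delegate;
  private final TrackingConf trackedConf;

  WrappedInputFormat(SimpleInputFormat delegate, TrackingConf trackedConf) {
    this.delegate = delegate;
    this.trackedConf = trackedConf;
  }

  @Override
  public String describeSplits(SimpleContext context) {
    trackedConf.copyAddedTo(context.getConfiguration());
    return delegate.describeSplits(context);
  }
}
```

In this sketch, a parameter set on the TrackingConf after a SimpleContext has already copied
it is invisible to a bare format, but visible once the same format is called through
WrappedInputFormat.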

I cleaned up HiveGiraphRunner and moved all control to its io formats. I can do the same for
HCatalog in a separate patch.

This will also help us do GIRAPH-639 in a cleaner way, and it will actually be possible to
mix different kinds of input formats (hcat, hive, hbase, or whatever).


This addresses bug GIRAPH-648.
    https://issues.apache.org/jira/browse/GIRAPH-648


Diffs
-----

  giraph-core/src/main/java/org/apache/giraph/bsp/BspOutputFormat.java 574895c
  giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java 7f9e38e
  giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 8dfe546
  giraph-core/src/main/java/org/apache/giraph/io/EdgeInputFormat.java 43cc7be
  giraph-core/src/main/java/org/apache/giraph/io/VertexInputFormat.java b3f234f
  giraph-core/src/main/java/org/apache/giraph/io/VertexOutputFormat.java 71eb665
  giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedEdgeInputFormat.java PRE-CREATION
  giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedVertexInputFormat.java PRE-CREATION
  giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedVertexOutputFormat.java PRE-CREATION
  giraph-core/src/main/java/org/apache/giraph/io/internal/package-info.java PRE-CREATION
  giraph-core/src/main/java/org/apache/giraph/io/superstep_output/MultiThreadedSuperstepOutput.java af086e1
  giraph-core/src/main/java/org/apache/giraph/io/superstep_output/SynchronizedSuperstepOutput.java 2a7af29
  giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java d01dbb4
  giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java 037cdfc
  giraph-core/src/main/java/org/apache/giraph/worker/EdgeInputSplitsCallable.java afb636b
  giraph-core/src/main/java/org/apache/giraph/worker/VertexInputSplitsCallable.java c426032
  giraph-examples/src/test/java/org/apache/giraph/TestBspBasic.java e034b2f
  giraph-hive/src/main/java/org/apache/giraph/hive/HiveGiraphRunner.java 6e40b7f
  giraph-hive/src/main/java/org/apache/giraph/hive/common/GiraphHiveConstants.java f8363b1
  giraph-hive/src/main/java/org/apache/giraph/hive/common/HiveProfiles.java 892d443
  giraph-hive/src/main/java/org/apache/giraph/hive/common/HiveUtils.java PRE-CREATION
  giraph-hive/src/main/java/org/apache/giraph/hive/input/edge/HiveEdgeInputFormat.java c482cf0
  giraph-hive/src/main/java/org/apache/giraph/hive/input/vertex/HiveVertexInputFormat.java 097aeef
  giraph-hive/src/main/java/org/apache/giraph/hive/output/HiveVertexOutputFormat.java 45c9ca3
  giraph-hive/src/main/java/org/apache/giraph/hive/output/HiveVertexWriter.java 0215428

Diff: https://reviews.apache.org/r/10690/diff/


Testing
-------

mvn clean verify
Real application run with hive io


Thanks,

Maja Kabiljo

