giraph-dev mailing list archives

From "Ahmet Emre Aladag" <emre.ala...@agmlab.com>
Subject Review Request 13492: LinkRank implementation with Giraph
Date Tue, 13 Aug 2013 08:30:31 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13492/
-----------------------------------------------------------

Review request for giraph.


Bugs: GIRAPH-729
    https://issues.apache.org/jira/browse/GIRAPH-729


Repository: giraph-git


Description
-------

Currently, Nutch 2.x lacks LinkRank (a variant of PageRank). A module for Nutch that includes
LinkRank, and possibly other ranking algorithms, would be useful to the Apache community. This
module can also be used by Nutch 1.x and other applications.

Attached you can find my patch. It includes:

* I/O formats (URL Text-URL Text edges, URL Text nodes) for reading from HDFS and HBase
* Self-link and duplicate-link elimination
* LinkRank computation (10 iterations by default)
* Cumulative distribution normalization
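
To illustrate the link filtering and the iteration steps listed above, here is a minimal single-machine sketch. It is not the code from the patch: the class name LinkRankSketch, its method names, the damping factor of 0.85, and the handling of pages without outgoing links are all assumptions, and the update rule is the textbook PageRank formula rather than whatever the Giraph computation actually uses. Only the default of 10 iterations comes from the description.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of link filtering plus a PageRank-style iteration. */
final class LinkRankSketch {

  /** Builds adjacency sets from (source, target) URL pairs. */
  static Map<String, Set<String>> buildGraph(List<String[]> edges) {
    Map<String, Set<String>> graph = new LinkedHashMap<>();
    for (String[] edge : edges) {
      // Register both endpoints so pages with no outgoing links still get a score.
      graph.computeIfAbsent(edge[0], k -> new LinkedHashSet<>());
      graph.computeIfAbsent(edge[1], k -> new LinkedHashSet<>());
      if (!edge[0].equals(edge[1])) {   // self-link elimination
        graph.get(edge[0]).add(edge[1]); // a Set silently drops duplicate links
      }
    }
    return graph;
  }

  /** Runs a fixed number of PageRank-style updates (damping is an assumed parameter). */
  static Map<String, Double> rank(Map<String, Set<String>> graph,
                                  int iterations, double damping) {
    int n = graph.size();
    Map<String, Double> scores = new HashMap<>();
    for (String page : graph.keySet()) {
      scores.put(page, 1.0 / n); // uniform start
    }
    for (int i = 0; i < iterations; i++) {
      Map<String, Double> next = new HashMap<>();
      for (String page : graph.keySet()) {
        next.put(page, 0.0);
      }
      double dangling = 0.0; // score held by pages without outgoing links
      for (Map.Entry<String, Set<String>> entry : graph.entrySet()) {
        double score = scores.get(entry.getKey());
        Set<String> targets = entry.getValue();
        if (targets.isEmpty()) {
          dangling += score;
          continue;
        }
        double share = score / targets.size();
        for (String target : targets) {
          next.merge(target, share, Double::sum);
        }
      }
      for (String page : graph.keySet()) {
        // Assumption: dangling score is spread evenly over all pages.
        double received = next.get(page) + dangling / n;
        next.put(page, (1.0 - damping) / n + damping * received);
      }
      scores = next;
    }
    return scores;
  }

  public static void main(String[] args) {
    List<String[]> edges = new ArrayList<>();
    edges.add(new String[] {"a", "b"});
    edges.add(new String[] {"a", "b"}); // duplicate link, dropped
    edges.add(new String[] {"a", "a"}); // self-link, dropped
    edges.add(new String[] {"b", "c"});
    edges.add(new String[] {"c", "a"});
    Map<String, Double> scores = rank(buildGraph(edges), 10, 0.85);
    System.out.println(scores);
  }
}
```

In the patch itself this work is presumably split between the edge filter and the vertex computation, with scores exchanged as messages between supersteps; the sketch collapses that into one loop for readability. The cumulative distribution normalization step is left out because the description does not say how it is defined.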


Diffs
-----

  giraph-nutch/pom.xml PRE-CREATION
  giraph-nutch/src/main/assembly/compile.xml PRE-CREATION
  giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/LinkRankComputation.java PRE-CREATION
  giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/LinkRankVertex.java PRE-CREATION
  giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/LinkRankVertexFilter.java PRE-CREATION
  giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/LinkRankVertexMasterCompute.java PRE-CREATION
  giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/LinkRankVertexWorkerContext.java PRE-CREATION
  giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/io/filters/LinkRankEdgeFilter.java PRE-CREATION
  giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/io/filters/package-info.java PRE-CREATION
  giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/io/formats/LinkRankEdgeInputFormat.java PRE-CREATION
  giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/io/formats/LinkRankVertexInputFormat.java PRE-CREATION
  giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/io/formats/LinkRankVertexOutputFormat.java PRE-CREATION
  giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/io/formats/LinkRankVertexUniformInputFormat.java PRE-CREATION
  giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/io/formats/Nutch2HostInputFormat.java PRE-CREATION
  giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/io/formats/Nutch2WebpageInputFormat.java PRE-CREATION
  giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/io/formats/Nutch2WebpageOutputFormat.java PRE-CREATION
  giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/io/formats/package-info.java PRE-CREATION
  giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/io/package-info.java PRE-CREATION
  giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/package-info.java PRE-CREATION
  giraph-nutch/src/main/java/org/apache/giraph/nutch/package-info.java PRE-CREATION
  giraph-nutch/src/main/java/org/apache/giraph/nutch/utils/NutchUtil.java PRE-CREATION
  giraph-nutch/src/main/java/org/apache/giraph/nutch/utils/StringDoublePair.java PRE-CREATION
  giraph-nutch/src/main/java/org/apache/giraph/nutch/utils/StringFloatPair.java PRE-CREATION
  giraph-nutch/src/main/java/org/apache/giraph/nutch/utils/StringStringPair.java PRE-CREATION
  giraph-nutch/src/main/java/org/apache/giraph/nutch/utils/package-info.java PRE-CREATION
  giraph-nutch/src/test/java/org/apache/giraph/nutch/LinkRankComputationTest.java PRE-CREATION
  giraph-nutch/src/test/java/org/apache/giraph/nutch/LinkRankHBaseTest.java PRE-CREATION
  giraph-nutch/src/test/java/org/apache/giraph/nutch/package-info.java PRE-CREATION
  pom.xml 41b6bb1

Diff: https://reviews.apache.org/r/13492/diff/


Testing
-------

* Unit tests for the computation on HDFS and HBase.


Thanks,

Ahmet Emre Aladag

