[ https://issues.apache.org/jira/browse/MAHOUT710?page=com.atlassian.jira.plugin.system.issuetabpanels:commenttabpanel&focusedCommentId=13050684#comment13050684
]
Tillmann Fiehn commented on MAHOUT710:

Hi Mahouters!
We worked on the features requested and are proud to present a first unit.
We implemented
* SimplifyGraph which can process a variaty of graphs
* AugmentGraphWithDegrees which augments a graph with degrees to increase stability for future
processings
* EnumerateTriangles which enumerates the triangles of a graph
This three jobs and a set of tests to proof functionality and usage a available from:
https://TillmannFiehn@github.com/TillmannFiehn/ktrusses.git
In there is a mvn project that builds against mahout0.5.
> Implementing KTrusses
> 
>
> Key: MAHOUT710
> URL: https://issues.apache.org/jira/browse/MAHOUT710
> Project: Mahout
> Issue Type: New Feature
> Affects Versions: 0.6
> Reporter: Tillmann Fiehn
> Assignee: Sebastian Schelter
> Original Estimate: 672h
> Remaining Estimate: 672h
>
> We are Tillmann Fiehn and Sebastian Arnold, IT students from TU Berlin. As Sebastian
Schelter already announced, we are atteding Isabel's and Sebastian's class "Large scale data
analysis and data mining" and picked an interesting project that we want to implement in Mahout.
We are open for any hints and suggestions and would appreciate if you could share your thoughts
on our proposal.
> Our goal is to implement a map/reduce algorithm for finding ktrusses in a given graph.
A ktruss is a nontrivial, singlecomponent maximal subgraph, such that every edge is contained
in at least k2 triangles in the subgraph. The algorithm was proposed in the IEEE paper J.
Cohen 2009: "Graph Twiddling in a MapReduce World" (http://www.csee.usf.edu/~anda/CIS6930S11/papers/graphprocessingwmapreduce.pdf)
and involves a number of graph algorithms that are to our knowledge currently not present
in Mahout:
> *Goal: finding KTrusses*
> * relaxation of kmember clique
> * nontrivial, singlecomponent maximal subgraph, s.t.
> * every edge is contained in at least k2 triangles in the subgraph
> *Algorithms to be implemented on top of Mahout / Hadoop:*
> *simplifyGraph*: Edges > RepresentativeEdges
> * removes Loops (not cycles)
> * aggregate duplicate edges
> *augmentGraphWithDegrees*: RepresentativeEdges > AugmentedEdges = (Edge (v, u) ,
d(v), d(u))
> * augements the edges with degree information for both nodes d(v) = {E  E = (x,y) a.
(x = v o. y = v) }
> *enumerateTriangles*: AugmentedEdges = (Edge, d(v), d(u)) > Triangles (v, u, s)
> * finds all triangles in a Graph
> *findComponents*: RepresentativeEdges > ZoneAssignments (v, z)
> * finds all components of a graph, each identified as the order number of the lowestorder
vertex contained
> * consists of:
> ** step 1: find adjacent zones: Edges x Zones > InterzoneEdges (z, z)
> ** step 2: merge adjacent zones into one (the lowestorder neighbouring zone): InterzoneEdges,
ZoneAssignments (v, z) > Pairs (v, z)
> {noformat}
> while true do:
> step 1
> if empty set interzone edges break;
> step 2
> done
> {noformat}
> *findKTrusses*: Edges, k > ZoneAssignments (v, z)
> * finds all ktrusses of the graph
> * each returned vertex v is part of a truss z
> {noformat}
> simplifyGraph
> while true do:
> augmentGraphWithDegrees
> enumerateTriangles
> keep only edges contained in k2 triangles
> if all edges kept break;
> done
> findComponents
> {noformat}
> We suppose to create the package {{org.apache.mahout.graph.trusses}} and {{org.apache.mahout.graph.components}}
in the {{core}} module.

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
