tinkerpop-commits mailing list archives

From spmalle...@apache.org
Subject [38/50] tinkerpop git commit: Merge branch 'tp31' into tp32
Date Thu, 06 Apr 2017 11:46:28 GMT
Merge branch 'tp31' into tp32

Project: http://git-wip-us.apache.org/repos/asf/tinkerpop/repo
Commit: http://git-wip-us.apache.org/repos/asf/tinkerpop/commit/aa5c4eea
Tree: http://git-wip-us.apache.org/repos/asf/tinkerpop/tree/aa5c4eea
Diff: http://git-wip-us.apache.org/repos/asf/tinkerpop/diff/aa5c4eea

Branch: refs/heads/TINKERPOP-1577
Commit: aa5c4eead0e2ac0047bfd64832c29bd5391d9fb1
Parents: 0fd2666 b831ae8
Author: Robert Dale <robdale@gmail.com>
Authored: Wed Mar 29 11:43:01 2017 -0400
Committer: Robert Dale <robdale@gmail.com>
Committed: Wed Mar 29 11:43:01 2017 -0400

 docs/src/reference/the-graphcomputer.asciidoc | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --cc docs/src/reference/the-graphcomputer.asciidoc
index f449668,fb4331a..e4dff50
--- a/docs/src/reference/the-graphcomputer.asciidoc
+++ b/docs/src/reference/the-graphcomputer.asciidoc
@@@ -520,44 -474,18 +520,44 @@@ only comes into play with custom steps 
  . When evaluating traversals that rely on path information (i.e. the history of the traversal),
  computational limits can easily be reached due the link:http://en.wikipedia.org/wiki/Combinatorial_explosion[combinatoric
  of data. With path computing enabled, every traverser is unique and thus, must be enumerated as opposed to being
 -counted/merged. The difference being a collection of paths vs. a single 64-bit long at a single vertex. For more
 +counted/merged. The difference being a collection of paths vs. a single 64-bit long at a single vertex. In other words,
 + bulking is very unlikely with traversers that maintain path information. For more
- information on this concept, please see link:http://thinkaurelius.com/2012/11/11/faunus-provides-big-graph-data-analytics/[Faunus Provides Big Graph Data].
+ information on this concept, please see link:https://thinkaurelius.wordpress.com/2012/11/11/faunus-provides-big-graph-data-analytics/[Faunus Provides Big Graph Data].
 -. When traversals of the form `x.as('a').y.someSideEffectStep('a').z` are evaluated, the `a` object is stored in the
 -path information of the traverser and thus, such traversals (may) turn on path calculations when executed on a
  . Steps that are concerned with the global ordering of traversers do not have a meaningful representation in
  OLAP. For example, what does <<order-step,`order()`>>-step mean when all traversers are being processed in parallel?
  Even if the traversers were aggregated and ordered, then at the next step they would return to being executed in
  parallel and thus, in an unpredictable order. When `order()`-like steps are executed at the end of a traversal (i.e
 -the final step), the `TraverserMapReduce` job ensures the resultant serial representation is ordered accordingly.
 -. Steps that are concerned with providing a global aggregate to the next step of computation do not have a correlate
 -in OLAP. For example, <<fold-step,`fold()`>>-step can only fold up the objects at each executing vertex. Next, even
 -if a global fold was possible, where would it go? Which vertex would be the host of the data structure? The
 -`fold()`-step only makes sense as an end-step whereby a MapReduce job can generate the proper global-to-local data
 +the final step), `TraversalVertexProgram` ensures a serial representation is ordered accordingly. Moreover, it is intelligent enough
 +to maintain the ordering of `g.V().hasLabel("person").order().by("age").values("name")`. However, the OLAP traversal
 +`g.V().hasLabel("person").order().by("age").out().values("name")` will lose the original ordering as the `out()`-step
 +will rebroadcast traversers across the cluster.
 +Graph Filter
 +Most OLAP jobs do not require the entire source graph to faithfully execute their `VertexProgram`. For instance, if
 +`PageRankVertexProgram` is only going to compute the centrality of people in the friendship-graph, then the following
 +`GraphFilter` can be applied.
 +  vertices(hasLabel("person")).
 +  edges(bothE("knows")).
 +  program(PageRankVertexProgram...)
 +There are two methods for constructing a `GraphFilter`.
 +* `vertices(Traversal<Vertex,Vertex>)`: A traversal that will be used that can only analyze a vertex and its properties.
 +If the traversal `hasNext()`, the input `Vertex` is passed to the `GraphComputer`.
 +* `edges(Traversal<Vertex,Edge>)`: A traversal that will iterate all legal edges for the source vertex.
 +`GraphFilter` is a "push-down predicate" that providers can reason on to determine the most efficient way to provide
 +graph data to the `GraphComputer`.
 +IMPORTANT: Apache TinkerPop provides `GraphFilterStrategy` <<traversalstrategy,traversal strategy>> which analyzes a submitted
 +OLAP traversal and, if possible, creates an appropriate `GraphFilter` automatically. For instance, `g.V().count()` would
 +yield a `GraphFilter.edges(limit(0))`. Thus, for traversal submissions, users typically do not need to be aware of creating
- graph filters explicitly. Users can use the <<explain-step,`explain()`>>-step to see the `GraphFilter` generated by `GraphFilterStrategy`.
++graph filters explicitly. Users can use the <<explain-step,`explain()`>>-step to see the `GraphFilter` generated by `GraphFilterStrategy`.
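
Reader's note on the bulking sentence added in this hunk: the contrast between "a collection of paths" and "a single 64-bit long at a single vertex" can be sketched with a toy model. This is plain Python, not TinkerPop code. The graph, the `walk_bulked` and `walk_with_paths` functions, and the adjacency dictionary are all hypothetical names invented for illustration.

```python
from collections import Counter

# Hypothetical toy graph as an adjacency list (not a TinkerPop structure).
GRAPH = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d"],
    "d": ["e", "f"],
    "e": [],
    "f": [],
}


def walk_bulked(graph, starts, hops):
    """Keep only a count (the "bulk") of traversers sitting at each vertex."""
    bulks = Counter(starts)
    for _ in range(hops):
        nxt = Counter()
        for vertex, bulk in bulks.items():
            for neighbor in graph[vertex]:
                # Traversers arriving at the same vertex merge into one counter.
                nxt[neighbor] += bulk
        bulks = nxt
    return bulks


def walk_with_paths(graph, starts, hops):
    """Keep the full history of every traverser, so no two can be merged."""
    paths = [(v,) for v in starts]
    for _ in range(hops):
        paths = [p + (n,) for p in paths for n in graph[p[-1]]]
    return paths


bulked = walk_bulked(GRAPH, ["a"], 3)
paths = walk_with_paths(GRAPH, ["a"], 3)

# Same number of logical traversers, very different amount of state.
print(dict(bulked))
print(len(paths))
```

Both walks represent the same number of logical traversers, but the bulked walk stores one integer per occupied vertex while the path-tracking walk stores one tuple per traverser. On graphs with more branching that gap grows with every hop.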

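
Reader's note on the Graph Filter text in this hunk: the "push-down predicate" idea is that data is discarded while it is being loaded, before the vertex program sees it. The sketch below models that in plain Python and does not use the TinkerPop API. `load_filtered`, the sample data, and the lambda predicates are hypothetical. It also drops edges whose endpoints were filtered out, which is a simplification made for this example and not a claim about how `GraphFilter` treats such edges.

```python
# Hypothetical in-memory "source graph": vertex id -> label, plus labeled edges.
VERTICES = {1: "person", 2: "person", 3: "software", 4: "person"}
EDGES = [
    (1, "knows", 2),
    (1, "created", 3),
    (4, "knows", 1),
    (4, "created", 3),
]


def load_filtered(vertices, edges, vertex_ok, edge_ok):
    """Apply the vertex and edge predicates during loading, before any computation."""
    kept_vertices = {vid: label for vid, label in vertices.items() if vertex_ok(label)}
    kept_edges = [
        (src, label, dst)
        for src, label, dst in edges
        # Simplification: an edge survives only if both endpoints survived.
        if src in kept_vertices and dst in kept_vertices and edge_ok(label)
    ]
    return kept_vertices, kept_edges


# Rough analogue of vertices(hasLabel("person")) plus edges(bothE("knows")).
people, friendships = load_filtered(
    VERTICES, EDGES, lambda label: label == "person", lambda label: label == "knows"
)

# Rough analogue of the count example: every vertex is kept, no edge is needed.
all_vertices, no_edges = load_filtered(
    VERTICES, EDGES, lambda label: True, lambda label: False
)

print(sorted(people), friendships)
print(len(all_vertices), no_edges)
```

The second call mirrors the `g.V().count()` case in the text: counting vertices needs no edge data, so an edge predicate that rejects everything lets the loader skip edges entirely.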