tinkerpop-commits mailing list archives

From spmalle...@apache.org
Subject tinkerpop git commit: Added Gremlin's Anatomy tutorial
Date Wed, 28 Feb 2018 19:16:02 GMT
Repository: tinkerpop
Updated Branches:
  refs/heads/tp32 1a857da8a -> 3aa9e70ef

Added Gremlin's Anatomy tutorial

I might add more to this, but wanted the basic component parts of Gremlin documented. Seemed
best to make this part of a standalone document as it didn't quite fit that well in the reference
documentation, as it already has a way of introducing those topics and I didn't want to disturb
that too much. CTR

Project: http://git-wip-us.apache.org/repos/asf/tinkerpop/repo
Commit: http://git-wip-us.apache.org/repos/asf/tinkerpop/commit/3aa9e70e
Tree: http://git-wip-us.apache.org/repos/asf/tinkerpop/tree/3aa9e70e
Diff: http://git-wip-us.apache.org/repos/asf/tinkerpop/diff/3aa9e70e

Branch: refs/heads/tp32
Commit: 3aa9e70ef7e50d81886954e398b4355524f7b576
Parents: 1a857da
Author: Stephen Mallette <spmva@genoprime.com>
Authored: Wed Feb 28 14:14:01 2018 -0500
Committer: Stephen Mallette <spmva@genoprime.com>
Committed: Wed Feb 28 14:14:01 2018 -0500

 docs/src/index.asciidoc                         |   3 +
 .../tutorials/gremlins-anatomy/index.asciidoc   | 189 +++++++++++++++++++
 docs/static/images/gremlin-anatomy-filter.png   | Bin 0 -> 168854 bytes
 docs/static/images/gremlin-anatomy-group.png    | Bin 0 -> 62410 bytes
 docs/static/images/gremlin-anatomy-navigate.png | Bin 0 -> 60514 bytes
 docs/static/images/gremlin-anatomy.png          | Bin 0 -> 87212 bytes
 pom.xml                                         |  23 +++
 7 files changed, 215 insertions(+)

diff --git a/docs/src/index.asciidoc b/docs/src/index.asciidoc
index 40fbb8c..5cc3dd5 100644
--- a/docs/src/index.asciidoc
+++ b/docs/src/index.asciidoc
@@ -57,6 +57,8 @@ Note the "+" following the link in each table entry - it forces an asciidoc
 A gentle introduction to TinkerPop and the Gremlin traversal language that is divided into
five, ten and fifteen minute tutorial blocks.
 |image:gremlin-dashboard.png[] |link:http://tinkerpop.apache.org/docs/x.y.z/tutorials/the-gremlin-console/[The
Gremlin Console] +
 Provides a detailed look at The Gremlin Console and how it can be used when working with
+^|image:gremlin-anatomy.png[width=125] |link:http://tinkerpop.apache.org/docs/x.y.z/tutorials/gremlins-anatomy/[Gremlin's
+Identifies and explains the component parts of a Gremlin traversal.
 ^|image:gremlin-chef.png[width=125] |link:http://tinkerpop.apache.org/docs/x.y.z/recipes/[Gremlin
 A collection of best practices and common traversal patterns for Gremlin.
 ^|image:gremlin-house-of-mirrors-cropped.png[width=200] |link:http://tinkerpop.apache.org/docs/x.y.z/tutorials/gremlin-language-variants/[Gremlin
Language Variants]
@@ -77,6 +79,7 @@ A getting started guide for users of graph databases and the Gremlin query
 Unless otherwise noted, all "publications" are externally managed:
+* Mallette, S.P., link:https://www.slideshare.net/StephenMallette/gremlins-anatomy-88713465["Gremlin's
Anatomy,"] DataStax User Group, February 2018.
 * Rodriguez, M.A., link:https://www.slideshare.net/slidarko/gremlin-1013-on-your-fm-dial["Gremlin
101.3 On Your FM Dial,"] DataStax Support and Engineering Summits, Carmel California and Las
Vegas Nevada, May 2017.
 * Rodriguez, M.A., link:https://www.datastax.com/2017/03/graphoendodonticology["Graphoendodonticology,"]
DataStax Engineering Blog, March 2017
 * Rodriguez, M.A., link:http://www.datastax.com/dev/blog/gremlins-time-machine["Gremlin's
Time Machine,"] DataStax Engineering Blog, September 2016.

diff --git a/docs/src/tutorials/gremlins-anatomy/index.asciidoc b/docs/src/tutorials/gremlins-anatomy/index.asciidoc
new file mode 100644
index 0000000..b36d881
--- /dev/null
+++ b/docs/src/tutorials/gremlins-anatomy/index.asciidoc
@@ -0,0 +1,189 @@
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to You under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+  http://www.apache.org/licenses/LICENSE-2.0
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+== Gremlin's Anatomy
+image:gremlin-anatomy.png[width=160,float=left]The Gremlin language is typically described
by the individual
+link:http://tinkerpop.apache.org/docs/x.y.z/reference/#graph-traversal-steps[steps] that
make up the language, but it
+is worth taking a look at the component parts of Gremlin that make a traversal work. Understanding
these component
+parts makes it possible to discuss and understand more advanced Gremlin topics, such as
+link:http://tinkerpop.apache.org/docs/x.y.z/reference/#dsl[Gremlin DSL] development and Gremlin
debugging techniques.
+Ultimately, Gremlin's Anatomy provides a foundational understanding for helping to read and
follow Gremlin of arbitrary
+complexity, which will lead you to more easily identify traversal patterns and thus enable
you to craft better
+traversals of your own.
+NOTE: This tutorial is based on Stephen Mallette's presentation on Gremlin's Anatomy - the
slides for that presentation
+can be found link:https://www.slideshare.net/StephenMallette/gremlins-anatomy-88713465[here].
+The component parts of a Gremlin traversal can all be identified from the following code:
+g.V().
+  has('person', 'name', within('marko', 'josh')).
+  outE().
+  groupCount().
+    by(label()).next()
+In plain English, this traversal requests an out-edge label distribution for "marko" and
"josh". The following
+sections will pick this traversal apart to show each component part and discuss it in some detail.
+=== GraphTraversalSource
+_`g.V()`_ - You are likely well acquainted with this bit of Gremlin. It is in virtually every
traversal you read in
+documentation, blog posts, or examples and is likely the start of most every traversal you
will write in your own
+While it is well known that `g.V()` returns a list of all the vertices in the graph, the
technical underpinnings of
+this ubiquitous statement may be less well understood. First of all, the `g` is a variable.
It could have been
+`x`, `y` or anything else, but by convention, you will normally see `g`. This `g` is a `GraphTraversalSource`
+and it spawns `GraphTraversal` instances with start steps. `V()` is one such start step,
but there are others like
+`E()` for getting all the edges in the graph. The important part is that these start steps
begin the traversal.
+In addition to exposing the available start steps, the `GraphTraversalSource` also holds
configuration options (perhaps
+think of them as pre-instructions for Gremlin) to be used for the traversal execution. The
methods that allow you to
+set these configurations are prefixed by the word "with". Here are a few examples to consider:
+g.withStrategies(SubgraphStrategy.build().vertices(hasLabel('person')).create()).  <1>
+  V().has('name','marko').out().values('name')
+g.withSack(1.0f).V().sack()                                                        <2>
+g.withComputer().V().pageRank()                                                    <3>
+<1> Define a link:http://tinkerpop.apache.org/docs/x.y.z/reference/#traversalstrategy[strategy]
for the traversal
+<2> Define an initial link:http://tinkerpop.apache.org/docs/x.y.z/reference/#sack-step[sack]
+<3> Define a link:http://tinkerpop.apache.org/docs/x.y.z/reference/#graphcomputer[GraphComputer]
to use in conjunction
+with a `VertexProgram` for OLAP based traversals - for example, see
+IMPORTANT: How you instantiate the `GraphTraversalSource` depends heavily on the graph
database implementation that
+you are using. Typically, they are instantiated from a `Graph` instance with the `traversal()`
method, but some graph
+databases, ones that are managed or "server-oriented", will simply give you a `g` to work
with. Consult the
+documentation of your graph database to determine how the `GraphTraversalSource` is constructed.
+=== GraphTraversal
+As you now know, a `GraphTraversal` is spawned from the start steps of a `GraphTraversalSource`.
The `GraphTraversal`
+contains the steps that make up the Gremlin language. Each step returns a `GraphTraversal`
so that the steps can be
+chained together in a fluent fashion. Revisiting the example from above:
+g.V().
+  has('person', 'name', within('marko', 'josh')).
+  outE().
+  groupCount().
+    by(label()).next()
+the `GraphTraversal` components are represented by the `has()`, `outE()` and `groupCount()`
steps. The key to reading
+this Gremlin is to realize that the output of one step becomes the input to the next. Therefore,
if you consider the
+start step of `V()` and realize that it returns vertices in the graph, the input to `has()`
is going to be a `Vertex`.
+The `has()` step is a filtering step and will take the vertices that are passed into it and
block any that do not
+meet the criteria it has specified. In this case, that means that the output of the `has()`
step is vertices that have
+the label of "person" and the "name" property value of "josh" or "marko". 
+Given that you know the output of `has()`, you then also know the input to `outE()`. Recall
that `outE()` is a
+navigational step in that it enables movement about the graph. In this case, `outE()` tells
Gremlin to take the
+incoming "marko" and "josh" vertices and traverse their outgoing edges as the output.
+Now that it is clear that the output of `outE()` is an edge, you are aware of the input to
`groupCount()` - edges.
+The `groupCount()` step requires a bit more discussion of other Gremlin components and will
thus be examined in the
+following sections. At this point, it is simply worth noting that the output of `groupCount()`
is a `Map` and if a
+Gremlin step followed it, the input to that step would therefore be a `Map`.
+The previous paragraph ended with an interesting point, in that it implied that there were
no "steps" following
+`groupCount()`. Clearly, `groupCount()` is not the last function to be called in that Gremlin
statement so you might
+wonder what the remaining bits are, specifically: `by(label()).next()`. The following sections
will discuss those
+remaining pieces.
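The way each step's output feeds the next step can be sketched outside of Gremlin. The following is a toy Python model, not the TinkerPop API: the `VERTICES` and `EDGES` data and the `v`, `has`, and `out_e` functions are hypothetical names invented for illustration, and the data is only a small sample loosely modeled on the "marko" and "josh" example.

```python
# Toy model only: plain Python generators standing in for Gremlin steps.
# All data and function names here are assumptions made for illustration.
VERTICES = [
    {"id": 1, "label": "person", "name": "marko"},
    {"id": 2, "label": "person", "name": "vadas"},
    {"id": 3, "label": "software", "name": "lop"},
    {"id": 4, "label": "person", "name": "josh"},
    {"id": 5, "label": "software", "name": "ripple"},
]
EDGES = [
    {"label": "knows", "out": 1, "in": 2},
    {"label": "knows", "out": 1, "in": 4},
    {"label": "created", "out": 1, "in": 3},
    {"label": "created", "out": 4, "in": 5},
    {"label": "created", "out": 4, "in": 3},
]


def v():
    """Start step: emit every vertex."""
    yield from VERTICES


def has(vertices, label, key, allowed):
    """Filter step: vertices in, matching vertices out."""
    for vertex in vertices:
        if vertex["label"] == label and vertex[key] in allowed:
            yield vertex


def out_e(vertices):
    """Navigation step: vertices in, their outgoing edges out."""
    for vertex in vertices:
        for edge in EDGES:
            if edge["out"] == vertex["id"]:
                yield edge


# The output of each call is the input to the next one.
edges = list(out_e(has(v(), "person", "name", {"marko", "josh"})))
edge_labels = [edge["label"] for edge in edges]
print(edge_labels)
```

Reading the nested call from the inside out mirrors reading the Gremlin from top to bottom: vertices enter `has`, the surviving vertices enter `out_e`, and edges come out the other end.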
+=== Step Modulators
+It's been explained in several ways now that the output of one step becomes the input to
the next, so surely the `Map`
+produced by `groupCount()` will feed the `by()` step. As alluded to at the end of the previous
section, that
+expectation is not correct. Technically, `by()` is not a step. It is a step modulator. A
step modulator modifies the
+behavior of the previous step. In this case, it is telling Gremlin how the key for the `groupCount()`
should be
+determined. Or said another way in the context of the example, it answers this question:
What do you want the "marko"
+and "josh" edges to be grouped by?
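The difference between a step and a step modulator can be illustrated with a small sketch. This is a toy Python model under stated assumptions, not TinkerPop code: `GroupCountStep`, its `by` method, and the `Edge` tuple are hypothetical names. The point it shows is that `by` configures the step it follows rather than consuming that step's output.

```python
from collections import Counter, namedtuple

# Hypothetical edge type used only for this illustration.
Edge = namedtuple("Edge", ["label", "out_name", "in_name"])


class GroupCountStep:
    """Toy stand-in for a counting step whose key can be modulated."""

    def __init__(self):
        # Without modulation, group by the incoming object itself.
        self.key_fn = lambda item: item

    def by(self, key_fn):
        """Modulator: change how this step computes its keys, then return the step."""
        self.key_fn = key_fn
        return self

    def apply(self, items):
        """Run the step: objects in, a map of key -> count out."""
        return dict(Counter(self.key_fn(item) for item in items))


edges = [
    Edge("knows", "marko", "vadas"),
    Edge("knows", "marko", "josh"),
    Edge("created", "marko", "lop"),
    Edge("created", "josh", "ripple"),
]

unmodulated = GroupCountStep().apply(edges)
by_label = GroupCountStep().by(lambda edge: edge.label).apply(edges)
print(by_label)
```

In this sketch the unmodulated step counts each distinct edge once, while the modulated step buckets the same edges by label, which is the role `by(label())` plays in the example traversal.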
+=== Anonymous Traversals
+In this case, the answer to that question is provided by the anonymous traversal `label()`
as the argument to the step
+modulator `by()`. An anonymous traversal is a traversal that is not bound to a `GraphTraversalSource`.
It is
+constructed from the double underscore class (i.e. `__`), which exposes static functions
to spawn the anonymous
+traversals. The double underscore is usually not visible in examples and code because, by convention,
TinkerPop
+recommends that the functions of that class be exposed in a standalone fashion. In Java,
that would mean statically importing the
+methods, thus allowing `__.label()` to be referred to simply as `label()`.
+NOTE: In Java, the full package name for the `__` class is `org.apache.tinkerpop.gremlin.process.traversal.dsl.graph`.
+In the context of the example traversal, you can imagine Gremlin getting to the `groupCount()`
step with a "marko" or
+"josh" outgoing edge, checking the `by()` modulator to see "what to group by", and then putting
edges into buckets
+by their `label()` and incrementing a counter on each bucket.
+The output is thus an edge label distribution for the outgoing edges of the "marko" and "josh"
+=== Terminal Step
+Terminal steps are different from the `GraphTraversal` steps in that terminal steps do not
return a `GraphTraversal`
+instance, but instead return the result of the `GraphTraversal`. In the case of the example,
`next()` is the terminal
+step and it returns the `Map` constructed in the `groupCount()` step. Other examples of terminal
steps include:
+`hasNext()`, `toList()`, and `iterate()`. Without terminal steps, you don't have a result.
You only have a
+NOTE: You can read more about traversal iteration in the
Gremlin Console Tutorial.
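The role of a terminal step can be approximated with a lazy iterator. The sketch below is an analogy only, assuming that a Python generator is a fair stand-in for an un-iterated traversal; `source`, `pipeline`, and `visited` are hypothetical names, and Python's built-in `next()` and `list()` play the parts of `next()` and `toList()`.

```python
visited = []  # records which items have actually been pulled through


def source():
    """Emit items lazily, noting each one as it is produced."""
    for item in ("marko", "josh", "vadas"):
        visited.append(item)
        yield item


# Building the pipeline does no work yet; it only describes the work.
pipeline = (name.upper() for name in source())
visited_before = list(visited)

first = next(pipeline)   # pulls exactly one result, like next()
visited_after_next = list(visited)

rest = list(pipeline)    # drains the remaining results, like toList()
print(first, rest)
```

Until something pulls from the pipeline, nothing is evaluated, which is why a traversal without a terminal step yields no result.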
+=== Expressions
+It is worth backing up a moment to re-examine the `has()` step. Now that you have come to
understand anonymous
+traversals, it would be reasonable to make the assumption that the `within()` argument to
`has()` falls into that
+category. It does not. The `within()` option is not a step either, but instead, something
called an expression. An
+expression typically refers to anything not mentioned in the previously described Gremlin
component categories that
+can make Gremlin easier to read, write and maintain. Common examples of expressions would
be string tokens, enum
+values, and classes with static methods that might spawn certain required values.
+A concrete example would be the class from which `within()` is called - `P`. The `P` class
spawns `Predicate` values
+that can be used as arguments for certain traversal steps. Another example would be the `T`
enum which provides a type
+safe way to reference `id` and `label` keys in a traversal. Like anonymous traversals, these
classes are usually
+statically imported so that instead of having to write `P.within()`, you can simply write
`within()`, as shown in the
+== Conclusion
+There's much more to a traversal than just a bunch of steps. Gremlin's Anatomy puts names
to each of these component
+parts of a traversal and explains how they connect together. Understanding these component
parts should help provide
+more insight into how Gremlin works and help you grow in your Gremlin abilities.
\ No newline at end of file

diff --git a/docs/static/images/gremlin-anatomy-filter.png b/docs/static/images/gremlin-anatomy-filter.png
new file mode 100755
index 0000000..317bb8c
Binary files /dev/null and b/docs/static/images/gremlin-anatomy-filter.png differ

diff --git a/docs/static/images/gremlin-anatomy-group.png b/docs/static/images/gremlin-anatomy-group.png
new file mode 100755
index 0000000..0039c02
Binary files /dev/null and b/docs/static/images/gremlin-anatomy-group.png differ

diff --git a/docs/static/images/gremlin-anatomy-navigate.png b/docs/static/images/gremlin-anatomy-navigate.png
new file mode 100755
index 0000000..152e40e
Binary files /dev/null and b/docs/static/images/gremlin-anatomy-navigate.png differ

diff --git a/docs/static/images/gremlin-anatomy.png b/docs/static/images/gremlin-anatomy.png
new file mode 100755
index 0000000..d83ebf7
Binary files /dev/null and b/docs/static/images/gremlin-anatomy.png differ

diff --git a/pom.xml b/pom.xml
index 259f16a..f6ff536 100644
--- a/pom.xml
+++ b/pom.xml
@@ -1036,6 +1036,29 @@ limitations under the License.
+                            <execution>
+                                <id>tutorial-gremlins-anatomy</id>
+                                <phase>generate-resources</phase>
+                                <goals>
+                                    <goal>process-asciidoc</goal>
+                                </goals>
+                                <configuration>
+                                    <sourceDirectory>${asciidoc.input.dir}/tutorials/gremlins-anatomy</sourceDirectory>
+                                    <sourceDocumentName>index.asciidoc</sourceDocumentName>
+                                    <outputDirectory>${htmlsingle.output.dir}/tutorials/gremlins-anatomy
+                                    </outputDirectory>
+                                    <backend>html5</backend>
+                                    <doctype>article</doctype>
+                                    <attributes>
+                                        <imagesdir>../../images</imagesdir>
+                                        <encoding>UTF-8</encoding>
+                                        <stylesdir>${asciidoctor.style.dir}</stylesdir>
+                                        <stylesheet>tinkerpop.css</stylesheet>
+                                        <source-highlighter>coderay</source-highlighter>
+                                        <basedir>${project.basedir}</basedir>
+                                    </attributes>
+                                </configuration>
+                            </execution>
