Return-Path: X-Original-To: apmail-tinkerpop-commits-archive@minotaur.apache.org Delivered-To: apmail-tinkerpop-commits-archive@minotaur.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id C9A4618AB3 for ; Mon, 22 Feb 2016 17:04:22 +0000 (UTC) Received: (qmail 49156 invoked by uid 500); 22 Feb 2016 17:04:22 -0000 Delivered-To: apmail-tinkerpop-commits-archive@tinkerpop.apache.org Received: (qmail 49130 invoked by uid 500); 22 Feb 2016 17:04:22 -0000 Mailing-List: contact commits-help@tinkerpop.incubator.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@tinkerpop.incubator.apache.org Delivered-To: mailing list commits@tinkerpop.incubator.apache.org Received: (qmail 49121 invoked by uid 99); 22 Feb 2016 17:04:22 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd3-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 22 Feb 2016 17:04:22 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd3-us-west.apache.org (ASF Mail Server at spamd3-us-west.apache.org) with ESMTP id 3AFCD18238F for ; Mon, 22 Feb 2016 17:04:22 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd3-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: -3.549 X-Spam-Level: X-Spam-Status: No, score=-3.549 tagged_above=-999 required=6.31 tests=[KAM_ASCII_DIVIDERS=0.8, KAM_LAZY_DOMAIN_SECURITY=1, RCVD_IN_DNSWL_HI=-5, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, RP_MATCHES_RCVD=-0.329] autolearn=disabled Received: from mx2-lw-us.apache.org ([10.40.0.8]) by localhost (spamd3-us-west.apache.org [10.40.0.10]) (amavisd-new, port 10024) with ESMTP id 7ghYjrqWuidQ for ; Mon, 22 Feb 2016 17:04:16 +0000 (UTC) Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by mx2-lw-us.apache.org (ASF Mail Server at mx2-lw-us.apache.org) with SMTP id 12C5A5FAEC for ; Mon, 22 Feb 2016 17:04:15 +0000 (UTC) Received: (qmail 48454 invoked by uid 99); 22 Feb 2016 17:04:15 -0000 Received: from git1-us-west.apache.org (HELO git1-us-west.apache.org) (140.211.11.23) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 22 Feb 2016 17:04:15 +0000 Received: by git1-us-west.apache.org (ASF Mail Server at git1-us-west.apache.org, from userid 33) id 7C817E03C2; Mon, 22 Feb 2016 17:04:15 +0000 (UTC) Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit From: spmallette@apache.org To: commits@tinkerpop.incubator.apache.org Date: Mon, 22 Feb 2016 17:04:15 -0000 Message-Id: X-Mailer: ASF-Git Admin Mailer Subject: [1/2] incubator-tinkerpop git commit: Extracted "provider docs" to their own book. Repository: incubator-tinkerpop Updated Branches: refs/heads/TINKERPOP-937 [created] 052681929 http://git-wip-us.apache.org/repos/asf/incubator-tinkerpop/blob/05268192/docs/src/reference/implementations-intro.asciidoc ---------------------------------------------------------------------- diff --git a/docs/src/reference/implementations-intro.asciidoc b/docs/src/reference/implementations-intro.asciidoc index 5357ce3..2cf4803 100644 --- a/docs/src/reference/implementations-intro.asciidoc +++ b/docs/src/reference/implementations-intro.asciidoc @@ -20,526 +20,8 @@ Implementations image::gremlin-racecar.png[width=325] -[[graph-system-provider-requirements]] -Graph System Provider Requirements ----------------------------------- - -image:tinkerpop-enabled.png[width=140,float=left] At the core of TinkerPop3 is a Java8 API. The implementation of this -core API and its validation via the `gremlin-test` suite is all that is required of a graph system provider wishing to -provide a TinkerPop3-enabled graph engine. Once a graph system has a valid implementation, then all the applications -provided by TinkerPop (e.g. Gremlin Console, Gremlin Server, etc.) and 3rd-party developers (e.g. Gremlin-Scala, -Gremlin-JS, etc.) will integrate properly. Finally, please feel free to use the logo on the left to promote your -TinkerPop3 implementation. - -Implementing Gremlin-Core -~~~~~~~~~~~~~~~~~~~~~~~~~ - -The classes that a graph system provider should focus on implementing are itemized below. It is a good idea to study -the <> (in-memory OLTP and OLAP in `tinkergraph-gremlin`), <> -(OTLP w/ transactions in `neo4j-gremlin`) and/or <> (OLAP in `hadoop-gremlin`) -implementations for ideas and patterns. - -. Online Transactional Processing Graph Systems (*OLTP*) - .. Structure API: `Graph`, `Element`, `Vertex`, `Edge`, `Property` and `Transaction` (if transactions are supported). - .. Process API: `TraversalStrategy` instances for optimizing Gremlin traversals to the provider's graph system (i.e. `TinkerGraphStepStrategy`). -. Online Analytics Processing Graph Systems (*OLAP*) - .. Everything required of OTLP is required of OLAP (but not vice versa). - .. GraphComputer API: `GraphComputer`, `Messenger`, `Memory`. - -Please consider the following implementation notes: - -* Be sure your `Graph` implementation is named as `XXXGraph` (e.g. TinkerGraph, Neo4jGraph, HadoopGraph, etc.). -* Use `StringHelper` to ensuring that the `toString()` representation of classes are consistent with other implementations. -* Ensure that your implementation's `Features` (Graph, Vertex, etc.) are correct so that test cases handle particulars accordingly. -* Use the numerous static method helper classes such as `ElementHelper`, `GraphComputerHelper`, `VertexProgramHelper`, etc. -* There are a number of default methods on the provided interfaces that are semantically correct. However, if they are -not efficient for the implementation, override them. -* Implement the `structure/` package interfaces first and then, if desired, interfaces in the `process/` package interfaces. -* `ComputerGraph` is a `Wrapper` system that ensure proper semantics during a GraphComputer computation. - -[[oltp-implementations]] -OLTP Implementations -^^^^^^^^^^^^^^^^^^^^ - -image:pipes-character-1.png[width=110,float=right] The most important interfaces to implement are in the `structure/` -package. These include interfaces like Graph, Vertex, Edge, Property, Transaction, etc. The `StructureStandardSuite` -will ensure that the semantics of the methods implemented are correct. Moreover, there are numerous `Exceptions` -classes with static exceptions that should be thrown by the graph system so that all the exceptions and their -messages are consistent amongst all TinkerPop3 implementations. - -[[olap-implementations]] -OLAP Implementations -^^^^^^^^^^^^^^^^^^^^ - -image:furnace-character-1.png[width=110,float=right] Implementing the OLAP interfaces may be a bit more complicated. -Note that before OLAP interfaces are implemented, it is necessary for the OLTP interfaces to be, at minimal, -implemented as specified in <>. A summary of each required interface -implementation is presented below: - -. `GraphComputer`: A fluent builder for specifying an isolation level, a VertexProgram, and any number of MapReduce jobs to be submitted. -. `Memory`: A global blackboard for ANDing, ORing, INCRing, and SETing values for specified keys. -. `Messenger`: The system that collects and distributes messages being propagated by vertices executing the VertexProgram application. -. `MapReduce.MapEmitter`: The system that collects key/value pairs being emitted by the MapReduce applications map-phase. -. `MapReduce.ReduceEmitter`: The system that collects key/value pairs being emitted by the MapReduce applications combine- and reduce-phases. - -NOTE: The VertexProgram and MapReduce interfaces in the `process/computer/` package are not required by the graph -system. Instead, these are interfaces to be implemented by application developers writing VertexPrograms and MapReduce jobs. - -IMPORTANT: TinkerPop3 provides three OLAP implementations: <> (TinkerGraph), -<> (HadoopGraph), and <> (Hadoop). -Given the complexity of the OLAP system, it is good to study and copy many of the patterns used in these reference -implementations. - -Implementing GraphComputer -++++++++++++++++++++++++++ - -image:furnace-character-3.png[width=150,float=right] The most complex method in GraphComputer is the `submit()`-method. The method must do the following: - -. Ensure the the GraphComputer has not already been executed. -. Ensure that at least there is a VertexProgram or 1 MapReduce job. -. If there is a VertexProgram, validate that it can execute on the GraphComputer given the respectively defined features. -. Create the Memory to be used for the computation. -. Execute the VertexProgram.setup() method once and only once. -. Execute the VertexProgram.execute() method for each vertex. -. Execute the VertexProgram.terminate() method once and if true, repeat VertexProgram.execute(). -. When VertexProgram.terminate() returns true, move to MapReduce job execution. -. MapReduce jobs are not required to be executed in any specified order. -. For each Vertex, execute MapReduce.map(). Then (if defined) execute MapReduce.combine() and MapReduce.reduce(). -. Update Memory with runtime information. -. Construct a new `ComputerResult` containing the compute Graph and Memory. - -Implementing Memory -+++++++++++++++++++ - -image:gremlin-brain.png[width=175,float=left] The Memory object is initially defined by `VertexProgram.setup()`. -The memory data is available in the first round of the `VertexProgram.execute()` method. Each Vertex, when executing -the VertexProgram, can update the Memory in its round. However, the update is not seen by the other vertices until -the next round. At the end of the first round, all the updates are aggregated and the new memory data is available -on the second round. This process repeats until the VertexProgram terminates. - -Implementing Messenger -++++++++++++++++++++++ - -The Messenger object is similar to the Memory object in that a vertex can read and write to the Messenger. However, -the data it reads are the messages sent to the vertex in the previous step and the data it writes are the messages -that will be readable by the receiving vertices in the subsequent round. - -Implementing MapReduce Emitters -+++++++++++++++++++++++++++++++ - -image:hadoop-logo-notext.png[width=150,float=left] The MapReduce framework in TinkerPop3 is similar to the model -popularized by link:http://apache.hadoop.org[Hadoop]. The primary difference is that all Mappers process the vertices -of the graph, not an arbitrary key/value pair. However, the vertices' edges can not be accessed -- only their -properties. This greatly reduces the amount of data needed to be pushed through the MapReduce engine as any edge -information required, can be computed in the VertexProgram.execute() method. Moreover, at this stage, vertices can -not be mutated, only their token and property data read. A Gremlin OLAP system needs to provide implementations for -to particular classes: `MapReduce.MapEmitter` and `MapReduce.ReduceEmitter`. TinkerGraph's implementation is provided -below which demonstrates the simplicity of the algorithm (especially when the data is all within the same JVM). - -[source,java] ----- -public class TinkerMapEmitter implements MapReduce.MapEmitter { - - public Map> reduceMap; - public Queue> mapQueue; - private final boolean doReduce; - - public TinkerMapEmitter(final boolean doReduce) { <1> - this.doReduce = doReduce; - if (this.doReduce) - this.reduceMap = new ConcurrentHashMap<>(); - else - this.mapQueue = new ConcurrentLinkedQueue<>(); - } - - @Override - public void emit(K key, V value) { - if (this.doReduce) - this.reduceMap.computeIfAbsent(key, k -> new ConcurrentLinkedQueue<>()).add(value); <2> - else - this.mapQueue.add(new KeyValue<>(key, value)); <3> - } - - protected void complete(final MapReduce mapReduce) { - if (!this.doReduce && mapReduce.getMapKeySort().isPresent()) { <4> - final Comparator comparator = mapReduce.getMapKeySort().get(); - final List> list = new ArrayList<>(this.mapQueue); - Collections.sort(list, Comparator.comparing(KeyValue::getKey, comparator)); - this.mapQueue.clear(); - this.mapQueue.addAll(list); - } else if (mapReduce.getMapKeySort().isPresent()) { - final Comparator comparator = mapReduce.getMapKeySort().get(); - final List>> list = new ArrayList<>(); - list.addAll(this.reduceMap.entrySet()); - Collections.sort(list, Comparator.comparing(Map.Entry::getKey, comparator)); - this.reduceMap = new LinkedHashMap<>(); - list.forEach(entry -> this.reduceMap.put(entry.getKey(), entry.getValue())); - } - } -} ----- - -<1> If the MapReduce job has a reduce, then use one data structure (`reduceMap`), else use another (`mapList`). The -difference being that a reduction requires a grouping by key and therefore, the `Map>` definition. If no -reduction/grouping is required, then a simple `Queue>` can be leveraged. -<2> If reduce is to follow, then increment the Map with a new value for the key. `MapHelper` is a TinkerPop3 class -with static methods for adding data to a Map. -<3> If no reduce is to follow, then simply append a KeyValue to the queue. -<4> When the map phase is complete, any map-result sorting required can be executed at this point. - -[source,java] ----- -public class TinkerReduceEmitter implements MapReduce.ReduceEmitter { - - protected Queue> reduceQueue = new ConcurrentLinkedQueue<>(); - - @Override - public void emit(final OK key, final OV value) { - this.reduceQueue.add(new KeyValue<>(key, value)); - } - - protected void complete(final MapReduce mapReduce) { - if (mapReduce.getReduceKeySort().isPresent()) { - final Comparator comparator = mapReduce.getReduceKeySort().get(); - final List> list = new ArrayList<>(this.reduceQueue); - Collections.sort(list, Comparator.comparing(KeyValue::getKey, comparator)); - this.reduceQueue.clear(); - this.reduceQueue.addAll(list); - } - } -} ----- - -The method `MapReduce.reduce()` is defined as: - -[source,java] -public void reduce(final OK key, final Iterator values, final ReduceEmitter emitter) { ... } - -In other words, for the TinkerGraph implementation, iterate through the entrySet of the `reduceMap` and call the -`reduce()` method on each entry. The `reduce()` method can emit key/value pairs which are simply aggregated into a -`Queue>` in an analogous fashion to `TinkerMapEmitter` when no reduce is to follow. These two emitters -are tied together in `TinkerGraphComputer.submit()`. - -[source,java] ----- -... -for (final MapReduce mapReduce : mapReducers) { - if (mapReduce.doStage(MapReduce.Stage.MAP)) { - final TinkerMapEmitter mapEmitter = new TinkerMapEmitter<>(mapReduce.doStage(MapReduce.Stage.REDUCE)); - final SynchronizedIterator vertices = new SynchronizedIterator<>(this.graph.vertices()); - workers.setMapReduce(mapReduce); - workers.mapReduceWorkerStart(MapReduce.Stage.MAP); - workers.executeMapReduce(workerMapReduce -> { - while (true) { - final Vertex vertex = vertices.next(); - if (null == vertex) return; - workerMapReduce.map(ComputerGraph.mapReduce(vertex), mapEmitter); - } - }); - workers.mapReduceWorkerEnd(MapReduce.Stage.MAP); - - // sort results if a map output sort is defined - mapEmitter.complete(mapReduce); - - // no need to run combiners as this is single machine - if (mapReduce.doStage(MapReduce.Stage.REDUCE)) { - final TinkerReduceEmitter reduceEmitter = new TinkerReduceEmitter<>(); - final SynchronizedIterator>> keyValues = new SynchronizedIterator((Iterator) mapEmitter.reduceMap.entrySet().iterator()); - workers.mapReduceWorkerStart(MapReduce.Stage.REDUCE); - workers.executeMapReduce(workerMapReduce -> { - while (true) { - final Map.Entry> entry = keyValues.next(); - if (null == entry) return; - workerMapReduce.reduce(entry.getKey(), entry.getValue().iterator(), reduceEmitter); - } - }); - workers.mapReduceWorkerEnd(MapReduce.Stage.REDUCE); - reduceEmitter.complete(mapReduce); // sort results if a reduce output sort is defined - mapReduce.addResultToMemory(this.memory, reduceEmitter.reduceQueue.iterator()); <1> - } else { - mapReduce.addResultToMemory(this.memory, mapEmitter.mapQueue.iterator()); <2> - } - } -} -... ----- - -<1> Note that the final results of the reducer are provided to the Memory as specified by the application developer's -`MapReduce.addResultToMemory()` implementation. -<2> If there is no reduce stage, the the map-stage results are inserted into Memory as specified by the application -developer's `MapReduce.addResultToMemory()` implementation. - -[[io-implementations]] -IO Implementations -^^^^^^^^^^^^^^^^^^ - -If a `Graph` requires custom serializers for IO to work properly, implement the `Graph.io` method. A typical example -of where a `Graph` would require such a custom serializers is if their identifier system uses non-primitive values, -such as OrientDB's `Rid` class. From basic serialization of a single `Vertex` all the way up the stack to Gremlin -Server, the need to know how to handle these complex identifiers is an important requirement. - -The first step to implementing custom serializers is to first implement the `IoRegistry` interface and register the -custom classes and serializers to it. Each `Io` implementation has different requirements for what it expects from the -`IoRegistry`: - -* *GraphML* - No custom serializers expected/allowed. -* *GraphSON* - Register a Jackson `SimpleModule`. The `SimpleModule` encapsulates specific classes to be serialized, -so it does not need to be registered to a specific class in the `IoRegistry` (use `null`). -* *Gryo* - Expects registration of one of three objects: -** Register just the custom class with a `null` Kryo `Serializer` implementation - this class will use default "field-level" Kryo serialization. -** Register the custom class with a specific Kryo `Serializer' implementation. -** Register the custom class with a `Function` for those cases where the Kryo `Serializer` requires the `Kryo` instance to get constructed. - -This implementation should provide a zero-arg constructor as the stack may require instantiation via reflection. -Consider extending `AbstractIoRegistry` for convenience as follows: - -[source,java] ----- -public class MyGraphIoRegistry extends AbstractIoRegistry { - public MyGraphIoRegistry() { - register(GraphSONIo.class, null, new MyGraphSimpleModule()); - register(GryoIo.class, MyGraphIdClass.class, new MyGraphIdSerializer()); - } -} ----- - -In the `Graph.io` method, provide the `IoRegistry` object to the supplied `Builder` and call the `create` method to -return that `Io` instance as follows: - -[source,java] ----- -public I io(final Io.Builder builder) { - return (I) builder.graph(this).registry(myGraphIoRegistry).create(); -}} ----- - -In this way, `Graph` implementations can pre-configure custom serializers for IO interactions and users will not need -to know about those details. Following this pattern will ensure proper execution of the test suite as well as -simplified usage for end-users. - -IMPORTANT: Proper implementation of IO is critical to successful `Graph` operations in Gremlin Server. The Test Suite -does have "serialization" tests that provide some assurance that an implementation is working properly, but those -tests cannot make assertions against any specifics of a custom serializer. It is the responsibility of the -implementer to test the specifics of their custom serializers. - -TIP: Consider separating serializer code into its own module, if possible, so that clients that use the `Graph` -implementation remotely don't need a full dependency on the entire `Graph` - just the IO components and related -classes being serialized. - -[[validating-with-gremlin-test]] -Validating with Gremlin-Test -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -image:gremlin-edumacated.png[width=225] - -[source,xml] - - org.apache.tinkerpop - gremlin-test - x.y.z - - - org.apache.tinkerpop - gremlin-groovy-test - x.y.z - - -The operational semantics of any OLTP or OLAP implementation are validated by `gremlin-test` and functional -interoperability with the Groovy environment is ensured by `gremlin-groovy-test`. To implement these tests, provide -test case implementations as shown below, where `XXX` below denotes the name of the graph implementation (e.g. -TinkerGraph, Neo4jGraph, HadoopGraph, etc.). - -[source,java] ----- -// Structure API tests -@RunWith(StructureStandardSuite.class) -@GraphProviderClass(provider = XXXGraphProvider.class, graph = XXXGraph.class) -public class XXXStructureStandardTest {} - -// Process API tests -@RunWith(ProcessComputerSuite.class) -@GraphProviderClass(provider = XXXGraphProvider.class, graph = XXXGraph.class) -public class XXXProcessComputerTest {} - -@RunWith(ProcessStandardSuite.class) -@GraphProviderClass(provider = XXXGraphProvider.class, graph = XXXGraph.class) -public class XXXProcessStandardTest {} - -@RunWith(GroovyEnvironmentSuite.class) -@GraphProviderClass(provider = XXXProvider.class, graph = TinkerGraph.class) -public class XXXGroovyEnvironmentTest {} - -@RunWith(GroovyProcessStandardSuite.class) -@GraphProviderClass(provider = XXXGraphProvider.class, graph = TinkerGraph.class) -public class XXXGroovyProcessStandardTest {} - -@RunWith(GroovyProcessComputerSuite.class) -@GraphProviderClass(provider = XXXGraphComputerProvider.class, graph = TinkerGraph.class) -public class XXXGroovyProcessComputerTest {} ----- - -The above set of tests represent the minimum test suite set to implement. There are other "integration" and -"performance" tests that should be considered optional. Implementing those tests requires the same pattern as shown above. - -IMPORTANT: It is as important to look at "ignored" tests as it is to look at ones that fail. The `gremlin-test` -suite utilizes the `Feature` implementation exposed by the `Graph` to determine which tests to execute. If a test -utilizes features that are not supported by the graph, it will ignore them. While that may be fine, implementers -should validate that the ignored tests are appropriately bypassed and that there are no mistakes in their feature -definitions. Moreover, implementers should consider filling gaps in their own test suites, especially when -IO-related tests are being ignored. - -The only test-class that requires any code investment is the `GraphProvider` implementation class. This class is a -used by the test suite to construct `Graph` configurations and instances and provides information about the -implementation itself. In most cases, it is best to simply extend `AbstractGraphProvider` as it provides many -default implementations of the `GraphProvider` interface. - -Finally, specify the test suites that will be supported by the `Graph` implementation using the `@Graph.OptIn` -annotation. See the `TinkerGraph` implementation below as an example: - -[source,java] ----- -@Graph.OptIn(Graph.OptIn.SUITE_STRUCTURE_STANDARD) -@Graph.OptIn(Graph.OptIn.SUITE_PROCESS_STANDARD) -@Graph.OptIn(Graph.OptIn.SUITE_PROCESS_COMPUTER) -@Graph.OptIn(Graph.OptIn.SUITE_GROOVY_PROCESS_STANDARD) -@Graph.OptIn(Graph.OptIn.SUITE_GROOVY_PROCESS_COMPUTER) -@Graph.OptIn(Graph.OptIn.SUITE_GROOVY_ENVIRONMENT) -public class TinkerGraph implements Graph { ----- - -Only include annotations for the suites the implementation will support. Note that implementing the suite, but -not specifying the appropriate annotation will prevent the suite from running (an obvious error message will appear -in this case when running the mis-configured suite). - -There are times when there may be a specific test in the suite that the implementation cannot support (despite the -features it implements) or should not otherwise be executed. It is possible for implementers to "opt-out" of a test -by using the `@Graph.OptOut` annotation. The following is an example of this annotation usage as taken from -`HadoopGraph`: - -[source, java] ----- -@Graph.OptIn(Graph.OptIn.SUITE_PROCESS_STANDARD) -@Graph.OptIn(Graph.OptIn.SUITE_PROCESS_COMPUTER) -@Graph.OptOut( - test = "org.apache.tinkerpop.gremlin.process.graph.step.map.MatchTest$Traversals", - method = "g_V_matchXa_hasXname_GarciaX__a_inXwrittenByX_b__a_inXsungByX_bX", - reason = "Hadoop-Gremlin is OLAP-oriented and for OLTP operations, linear-scan joins are required. This particular tests takes many minutes to execute.") -@Graph.OptOut( - test = "org.apache.tinkerpop.gremlin.process.graph.step.map.MatchTest$Traversals", - method = "g_V_matchXa_inXsungByX_b__a_inXsungByX_c__b_outXwrittenByX_d__c_outXwrittenByX_e__d_hasXname_George_HarisonX__e_hasXname_Bob_MarleyXX", - reason = "Hadoop-Gremlin is OLAP-oriented and for OLTP operations, linear-scan joins are required. This particular tests takes many minutes to execute.") -@Graph.OptOut( - test = "org.apache.tinkerpop.gremlin.process.computer.GraphComputerTest", - method = "shouldNotAllowBadMemoryKeys", - reason = "Hadoop does a hard kill on failure and stops threads which stops test cases. Exception handling semantics are correct though.") -@Graph.OptOut( - test = "org.apache.tinkerpop.gremlin.process.computer.GraphComputerTest", - method = "shouldRequireRegisteringMemoryKeys", - reason = "Hadoop does a hard kill on failure and stops threads which stops test cases. Exception handling semantics are correct though.") -public class HadoopGraph implements Graph { ----- - -The above examples show how to ignore individual tests. It is also possible to: - -* Ignore an entire test case (i.e. all the methods within the test) by setting the `method` to "*". -* Ignore a "base" test class such that test that extend from those classes will all be ignored. This style of -ignoring is useful for Gremlin "process" tests that have bases classes that are extended by various Gremlin flavors (e.g. groovy). -* Ignore a `GraphComputer` test based on the type of `GraphComputer` being used. Specify the "computer" attribute on -the `OptOut` (which is an array specification) which should have a value of the `GraphComputer` implementation class -that should ignore that test. This attribute should be left empty for "standard" execution and by default all -`GraphComputer` implementations will be included in the `OptOut` so if there are multiple implementations, explicitly -specify the ones that should be excluded. - -Also note that some of the tests in the Gremlin Test Suite are parameterized tests and require an additional level of -specificity to be properly ignored. To ignore these types of tests, examine the name template of the parameterized -tests. It is defined by a Java annotation that looks like this: - -[source, java] -@Parameterized.Parameters(name = "expect({0})") - -The annotation above shows that the name of each parameterized test will be prefixed with "expect" and have -parentheses wrapped around the first parameter (at index 0) value supplied to each test. This information can -only be garnered by studying the test set up itself. Once the pattern is determined and the specific unique name of -the parameterized test is identified, add it to the `specific` property on the `OptOut` annotation in addition to -the other arguments. - -These annotations help provide users a level of transparency into test suite compliance (via the -xref:describe-graph[describeGraph()] utility function). It also allows implementers to have a lot of flexibility in -terms of how they wish to support TinkerPop. For example, maybe there is a single test case that prevents an -implementer from claiming support of a `Feature`. The implementer could choose to either not support the `Feature` -or to support it but "opt-out" of the test with a "reason" as to why so that users understand the limitation. - -IMPORTANT: Before using `OptOut` be sure that the reason for using it is sound and it is more of a last resort. -It is possible that a test from the suite doesn't properly represent the expectations of a feature, is too broad or -narrow for the semantics it is trying to enforce or simply contains a bug. Please consider raising issues in the -developer mailing list with such concerns before assuming `OptOut` is the only answer. - -IMPORTANT: There are no tests that specifically validate complete compliance with Gremlin Server. Generally speaking, -a `Graph` that passes the full Test Suite, should be compliant with Gremlin Server. The one area where problems can -occur is in serialization. Always ensure that IO is properly implemented, that custom serializers are tested fully -and ultimately integration test the `Graph` with an actual Gremlin Server instance. - -CAUTION: Configuring tests to run in parallel might result in errors that are difficult to debug as there is some -shared state in test execution around graph configuration. It is therefore recommended that parallelism be turned -off for the test suite (the Maven SureFire Plugin is configured this way by default). It may also be important to -include this setting, `false`, in the SureFire configuration if tests are failing in an -unexplainable way. - -Accessibility via GremlinPlugin -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -image:gremlin-plugin.png[width=100,float=left] The applications distributed with TinkerPop3 do not distribute with -any graph system implementations besides TinkerGraph. If your implementation is stored in a Maven repository (e.g. -Maven Central Repository), then it is best to provide a `GremlinPlugin` implementation so the respective jars can be -downloaded according and when required by the user. Neo4j's GremlinPlugin is provided below for reference. - -[source,java] ----- -public class Neo4jGremlinPlugin implements GremlinPlugin { - - private static final String IMPORT = "import "; - private static final String DOT_STAR = ".*"; - - private static final Set IMPORTS = new HashSet() {{ - add(IMPORT + Neo4jGraph.class.getPackage().getName() + DOT_STAR); - }}; - - @Override - public String getName() { - return "neo4j"; - } - - @Override - public void pluginTo(final PluginAcceptor pluginAcceptor) { - pluginAcceptor.addImports(IMPORTS); - } -} ----- - -With the above plugin implementations, users can now download respective binaries for Gremlin Console, Gremlin Server, etc. - -[source,groovy] -gremlin> g = Neo4jGraph.open('/tmp/neo4j') -No such property: Neo4jGraph for class: groovysh_evaluate -Display stack trace? [yN] -gremlin> :install org.apache.tinkerpop neo4j-gremlin x.y.z -==>loaded: [org.apache.tinkerpop, neo4j-gremlin, …] -gremlin> :plugin use tinkerpop.neo4j -==>tinkerpop.neo4j activated -gremlin> g = Neo4jGraph.open('/tmp/neo4j') -==>neo4jgraph[EmbeddedGraphDatabase [/tmp/neo4j]] - -In-Depth Implementations -~~~~~~~~~~~~~~~~~~~~~~~~ - -image:gremlin-painting.png[width=200,float=right] The graph system implementation details presented thus far are -minimum requirements necessary to yield a valid TinkerPop3 implementation. However, there are other areas that a -graph system provider can tweak to provide an implementation more optimized for their underlying graph engine. Typical -areas of focus include: - -* Traversal Strategies: A <> can be used to alter a traversal prior to its -execution. A typical example is converting a pattern of `g.V().has('name','marko')` into a global index lookup for -all vertices with name "marko". In this way, a `O(|V|)` lookup becomes an `O(log(|V|))`. Please review -`TinkerGraphStepStrategy` for ideas. -* Step Implementations: Every <> is ultimately referenced by the `GraphTraversal` -interface. It is possible to extend `GraphTraversal` to use a graph system specific step implementation. +TinkerPop offers several reference implementations of its interfaces that are not only meant for production usage, +but also represent models by which different graph providers can build their systems. More specific documentation +on how to build systems at this level of the API can be found in the +link:http://tinkerpop.apache.org/docs/3.1.1-incubating/dev/provider/[Provider Documentation]. The following sections +describe the various reference implementations and their usage. \ No newline at end of file http://git-wip-us.apache.org/repos/asf/incubator-tinkerpop/blob/05268192/pom.xml ---------------------------------------------------------------------- diff --git a/pom.xml b/pom.xml index 311ddad..11dd2ac 100644 --- a/pom.xml +++ b/pom.xml @@ -739,6 +739,31 @@ limitations under the License. + provider-book + generate-resources + + process-asciidoc + + + ${asciidoc.input.dir}/dev/provider + index.asciidoc + ${htmlsingle.output.dir}/dev/provider + html5 + book + + ../../images + UTF-8 + true + 3 + left + ${asciidoctor.style.dir} + tinkerpop.css + coderay + ${project.basedir} + + + + tutorial-getting-started generate-resources