tinkerpop-commits mailing list archives

From jorg...@apache.org
Subject [tinkerpop] 01/05: GraphBinary specification
Date Fri, 26 Oct 2018 09:33:36 GMT
This is an automated email from the ASF dual-hosted git repository.

jorgebg pushed a commit to branch TINKERPOP-1942
in repository https://gitbox.apache.org/repos/asf/tinkerpop.git

commit 2fc2683c0d518e2a27b8b3eba9c136be1d9ee9e8
Author: Jorge Bay Gondra <jorgebaygondra@gmail.com>
AuthorDate: Thu Apr 19 11:35:47 2018 +0200

    GraphBinary specification
---
 docs/src/dev/io/graphbinary.asciidoc | 616 +++++++++++++++++++++++++++++++++++
 docs/src/dev/io/index.asciidoc       |   2 +
 2 files changed, 618 insertions(+)

diff --git a/docs/src/dev/io/graphbinary.asciidoc b/docs/src/dev/io/graphbinary.asciidoc
new file mode 100644
index 0000000..2412a5d
--- /dev/null
+++ b/docs/src/dev/io/graphbinary.asciidoc
@@ -0,0 +1,616 @@
+////
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to You under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+
+////
+
+[[graphbinary]]
+= GraphBinary
+
+GraphBinary is a binary serialization format designed to reduce serialization overhead on both the client
+and the server, and to limit the size of the payload that is transmitted over the wire.
+
+It describes arbitrary object graphs with a fully-qualified format:
+
+[source]
+----
+{type_code}{value}
+----
+
+Where:
+
+- `{type_code}` is a single byte representing the type number.
+- `{value}` is a sequence of bytes whose content is determined by the type.
+
+All encodings are big-endian.
+
+Quick examples, using hexadecimal notation to represent each byte:
+
+- `01 00 00 00 01`: a 32-bit integer representing the decimal number 1. It is composed of the
+type_code `0x01` and four bytes that describe the value.
+- `01 00 00 00 ff`: a 32-bit integer representing the decimal number 255.
+- `02 00 00 00 00 00 00 00 01`: a 64-bit integer representing the decimal number 1. It is composed of the
+type_code `0x02` and eight bytes that describe the value.
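+
+The byte layouts above can be reproduced with a minimal Python sketch. The helper names `encode_int` and
+`encode_long` are hypothetical and only illustrate the `{type_code}{value}` layout; they are not part of any
+TinkerPop API.
+
```python
import struct

# Type codes taken from the "Core Data Types" list in this document.
TYPE_INT = 0x01
TYPE_LONG = 0x02


def encode_int(value: int) -> bytes:
    """Return {type_code}{value} for a 32-bit signed integer (big-endian)."""
    return bytes([TYPE_INT]) + struct.pack(">i", value)


def encode_long(value: int) -> bytes:
    """Return {type_code}{value} for a 64-bit signed integer (big-endian)."""
    return bytes([TYPE_LONG]) + struct.pack(">q", value)


if __name__ == "__main__":
    print(encode_int(1).hex())   # 0100000001
    print(encode_long(1).hex())  # 020000000000000001
```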
+
+== Version 1.0
+
+=== Forward Compatibility
+
+The serialization format supports new types being added without the need to introduce a new version.
+
+Changes to existing types require a new revision.
+
+=== Request Message
+
+Represents a message from the client to the server.
+
+Format: `{version}{request_id}{op}{processor}{args}`
+
+Where:
+
+- `{version}` is a `Byte` representing the protocol version, with the most significant bit set to one.
+For this version of the protocol, the expected value is `0x81` (`10000001`).
+- `{request_id}` is a `UUID`.
+- `{op}` is a `String`.
+- `{processor}` is a `String`.
+- `{args}` is a `Map`.
+
+The total length is not part of the message, as the transport layer provides it. For example, WebSockets,
+as a framing protocol, defines the payload length.
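+
+As an illustration of the `{version}` byte, the following Python sketch sets and checks the most significant
+bit. It assumes that the low seven bits carry the version number, which is consistent with `0x81` but is not
+stated explicitly above. The function names are hypothetical.
+
```python
PROTOCOL_VERSION = 1  # "Version 1.0" of the format described here


def version_byte(version: int = PROTOCOL_VERSION) -> int:
    """Build the version byte: most significant bit set, version in the low 7 bits (assumption)."""
    if not 0 <= version <= 0x7F:
        raise ValueError("version must fit in 7 bits")
    return 0x80 | version


def parse_version_byte(value: int) -> int:
    """Check the most significant bit and return the low 7 bits."""
    if value & 0x80 == 0:
        raise ValueError("most significant bit must be set to one")
    return value & 0x7F


if __name__ == "__main__":
    print(hex(version_byte()))       # 0x81
    print(parse_version_byte(0x81))  # 1
```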
+
+=== Response Message
+
+Format: `{version}{id_present}{request_id}{status_code}{status_message}{status_attributes}{result_meta}{result_data}`
+
+Where:
+
+- `{version}` is a `Byte` representing the protocol version, with the most significant bit set to one.
+For this version of the protocol, the expected value is `0x81` (`10000001`).
+- `{id_present}` is a single `Byte` indicating whether a request id is present. The only possible values
+are 0 and 1.
+- `{request_id}` is a `UUID`.
+- `{status_code}` is an `Int`.
+- `{status_message}` is a `String`.
+- `{status_attributes}` is a `Map`.
+- `{result_meta}` is a `Map`.
+- `{result_data}` is a fully qualified typed value composed of `{type_code}{value}`.
+
+The total length is not part of the message as the transport layer will provide it.
+
+=== Data Type Codes
+
+==== Core Data Types
+
+- `0x01`: Int
+- `0x02`: Long
+- `0x03`: String
+- `0x04`: Date
+- `0x05`: Timestamp
+- `0x06`: Class
+- `0x07`: Double
+- `0x08`: Float
+- `0x09`: List
+- `0x0a`: Map
+- `0x0b`: Set
+- `0x0c`: UUID
+- `0x0d`: Edge
+- `0x0e`: Path
+- `0x0f`: Property
+- `0x10`: TinkerGraph
+- `0x11`: Vertex
+- `0x12`: VertexProperty
+- `0x13`: Barrier
+- `0x14`: Binding
+- `0x15`: Bytecode
+- `0x16`: Cardinality
+- `0x17`: Column
+- `0x18`: Direction
+- `0x19`: Operator
+- `0x1a`: Order
+- `0x1b`: Pick
+- `0x1c`: Pop
+- `0x1d`: Lambda
+- `0x1e`: P
+- `0x1f`: Scope
+- `0x20`: T
+- `0x21`: Traverser
+- `0x22`: BigDecimal
+- `0x23`: BigInteger
+- `0x24`: Byte
+- `0x25`: ByteBuffer
+- `0x26`: Short
+- `0x00`: Custom
+
+==== Extended Types
+
+- `0x80`: Char
+- `0x81`: Duration
+- `0x82`: InetAddress
+- `0x83`: Instant
+- `0x84`: LocalDate
+- `0x85`: LocalDateTime
+- `0x86`: LocalTime
+- `0x87`: MonthDay
+- `0x88`: OffsetDateTime
+- `0x89`: OffsetTime
+- `0x90`: Period
+- `0x92`: Year
+- `0x93`: YearMonth
+- `0x94`: ZonedDateTime
+- `0x95`: ZoneOffset
+
+=== Data Type Formats
+
+==== Int
+
+Format: 4-byte two's complement integer.
+
+Example values:
+
+- `00 00 00 01`: 32-bit integer number 1.
+- `00 00 01 00`: 32-bit integer number 256.
+- `ff ff ff ff`: 32-bit integer number -1.
+- `ff ff ff fe`: 32-bit integer number -2.
+
+==== Long
+
+Format: 8-byte two's complement integer.
+
+Example values
+
+- `00 00 00 00 00 00 00 01`: 64-bit integer number 1.
+- `ff ff ff ff ff ff ff fe`: 64-bit integer number -2.
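+
+The bare `Int` and `Long` values (without the type code) map directly onto big-endian signed integers. A small
+Python sketch, with hypothetical helper names, using the standard `struct` module:
+
```python
import struct


def encode_int_value(value: int) -> bytes:
    """4-byte big-endian two's complement integer."""
    return struct.pack(">i", value)


def encode_long_value(value: int) -> bytes:
    """8-byte big-endian two's complement integer."""
    return struct.pack(">q", value)


def decode_int_value(data: bytes) -> int:
    (value,) = struct.unpack(">i", data)
    return value


if __name__ == "__main__":
    print(encode_int_value(-2).hex())   # fffffffe
    print(encode_long_value(-2).hex())  # fffffffffffffffe
```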
+
+==== String
+
+Format: `{length}{text_value}`
+
+Where:
+
+- `{length}` is an `Int` describing the byte length of the text. A length of -1 represents the null string.
+- `{text_value}` is a sequence of bytes representing the string value in UTF-8 encoding.
+
+Example values
+
+- `00 00 00 03 61 62 63`: the string 'abc'.
+- `00 00 00 04 61 62 63 64`: the string 'abcd'.
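+
+A minimal Python sketch of the `{length}{text_value}` layout, including the -1 length for the null string.
+The function names are hypothetical, and `None` is used here as the Python stand-in for the null string.
+
```python
import struct
from typing import Optional, Tuple


def encode_string(text: Optional[str]) -> bytes:
    """Encode a string as {length}{text_value}; None becomes length -1 with no payload."""
    if text is None:
        return struct.pack(">i", -1)
    data = text.encode("utf-8")
    # The length counts bytes, not characters.
    return struct.pack(">i", len(data)) + data


def decode_string(buf: bytes, offset: int = 0) -> Tuple[Optional[str], int]:
    """Decode a string starting at offset; return (value, offset after the string)."""
    (length,) = struct.unpack_from(">i", buf, offset)
    offset += 4
    if length == -1:
        return None, offset
    text = buf[offset:offset + length].decode("utf-8")
    return text, offset + length


if __name__ == "__main__":
    print(encode_string("abc").hex())  # 00000003616263
```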
+
+==== Date
+
+Format: An 8-byte two's complement signed integer representing a millisecond-precision offset from the Unix epoch.
+
+Example values
+
+- `00 00 00 00 00 00 00 00`: The moment in time 1970-01-01T00:00:00.000Z.
+- `ff ff ff ff ff ff ff ff`: The moment in time 1969-12-31T23:59:59.999Z.
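+
+The two example values can be checked with a short Python sketch. It assumes timezone-aware `datetime` objects
+in UTC; the helper names are hypothetical.
+
```python
import struct
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def encode_date(moment: datetime) -> bytes:
    """Encode an aware datetime as milliseconds since the Unix epoch (8 bytes, big-endian)."""
    millis = (moment - EPOCH) // timedelta(milliseconds=1)
    return struct.pack(">q", millis)


def decode_date(data: bytes) -> datetime:
    """Decode 8 bytes into an aware UTC datetime."""
    (millis,) = struct.unpack(">q", data)
    return EPOCH + timedelta(milliseconds=millis)


if __name__ == "__main__":
    print(decode_date(b"\xff" * 8).isoformat())  # 1969-12-31T23:59:59.999000+00:00
```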
+
+==== Timestamp
+
+Format: The same as `Date`.
+
+==== Class
+
+Format: A `String` containing the fully qualified class name (FQCN).
+
+==== Double
+
+Format: 8 bytes representing IEEE 754 double-precision binary floating-point format.
+
+Example values
+
+- `3f f0 00 00 00 00 00 00`: Double 1
+- `3f 70 00 00 00 00 00 00`: Double 0.00390625
+- `3f b9 99 99 99 99 99 9a`: Double 0.1
+
+==== Float
+
+Format: 4 bytes representing IEEE 754 single-precision binary floating-point format.
+
+Example values
+
+- `3f 80 00 00`: Float 1
+- `3e c0 00 00`: Float 0.375
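+
+The `Double` and `Float` example values above correspond to the big-endian IEEE 754 encodings produced by
+Python's `struct` module. A brief sketch with hypothetical helper names:
+
```python
import struct


def encode_double(value: float) -> bytes:
    """8-byte big-endian IEEE 754 double-precision value."""
    return struct.pack(">d", value)


def encode_float(value: float) -> bytes:
    """4-byte big-endian IEEE 754 single-precision value."""
    return struct.pack(">f", value)


if __name__ == "__main__":
    print(encode_double(0.1).hex())   # 3fb999999999999a
    print(encode_float(0.375).hex())  # 3ec00000
```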
+
+==== List
+
+An ordered collection of items.
+
+Format: `{length}{item_0}...{item_n}`
+
+Where:
+
+- `{length}` is a non-negative 4-byte integer describing the number of items in the list.
+- `{item_0}...{item_n}` are the items of the list. `{item_i}` is a fully qualified typed value composed of
+`{type_code}{value}`.
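+
+As a sketch of the `{length}{item_0}...{item_n}` layout, the following Python example encodes a list of `Int`
+items, each written as a fully qualified `{type_code}{value}`. The helper names are hypothetical, and the
+length is written as a big-endian 4-byte integer.
+
```python
import struct
from typing import Sequence

TYPE_INT = 0x01  # from the "Core Data Types" list


def encode_fq_int(value: int) -> bytes:
    """Fully qualified Int: type code followed by the 4-byte value."""
    return bytes([TYPE_INT]) + struct.pack(">i", value)


def encode_list_value(encoded_items: Sequence[bytes]) -> bytes:
    """Encode a list value from items that are already fully qualified."""
    return struct.pack(">i", len(encoded_items)) + b"".join(encoded_items)


if __name__ == "__main__":
    payload = encode_list_value([encode_fq_int(1), encode_fq_int(2)])
    print(payload.hex())  # 0000000201000000010100000002
```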
+
+==== Set
+
+A collection that contains no duplicate elements.
+
+Format: Same as `List`.
+
+==== Map
+
+A dictionary of keys to values.
+
+Format: `{length}{item_0}...{item_n}`
+
+Where:
+
+- `{length}` is a non-negative 4-byte integer describing the number of entries in the map.
+- `{item_0}...{item_n}` are the items of the map. `{item_i}` is a sequence of 2 fully qualified typed values,
+the first representing the key and the second representing the value, each composed of `{type_code}{value}`.
+
+==== UUID
+
+A 128-bit universally unique identifier.
+
+Format: 16 bytes representing the uuid.
+
+Example
+
+- `00 11 22 33 44 55 66 77 88 99 aa bb cc dd ee ff`: UUID 00112233-4455-6677-8899-aabbccddeeff.
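+
+The example above matches the byte order exposed by Python's standard `uuid` module, as this short sketch
+(with hypothetical helper names) shows:
+
```python
import uuid


def encode_uuid(value: uuid.UUID) -> bytes:
    """Return the 16 bytes of the UUID, most significant byte first."""
    return value.bytes


def decode_uuid(data: bytes) -> uuid.UUID:
    if len(data) != 16:
        raise ValueError("a UUID value is exactly 16 bytes")
    return uuid.UUID(bytes=data)


if __name__ == "__main__":
    raw = bytes.fromhex("00112233445566778899aabbccddeeff")
    print(decode_uuid(raw))  # 00112233-4455-6677-8899-aabbccddeeff
```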
+
+==== Edge
+
+Format: `{id}{label}{inVLabel}{outVLabel}{inV}{outV}{properties}`
+
+Where:
+
+- `{id}` is a fully qualified typed value composed of `{type_code}{value}`.
+- `{label}` is a `String` value.
+- `{inVLabel}` is a `String` value.
+- `{outVLabel}` is a `String` value.
+- `{inV}` is a fully qualified typed value composed of `{type_code}{value}`.
+- `{outV}` is a fully qualified typed value composed of `{type_code}{value}`.
+- `{properties}` is a `List` of `Property` items.
+
+==== Path
+
+Format: `{labels}{objects}`
+
+Where:
+
+- `{labels}` is a `List` in which each item is a `Set` of `String`.
+- `{objects}` is a `List` of fully qualified typed values.
+
+==== Property
+
+Format: `{key}{value}`
+
+Where:
+
+- `{key}` is a `String` value.
+- `{value}`  is a fully qualified typed value composed of `{type_code}{value}`.
+
+==== TinkerGraph
+
+A collection of vertices and edges.
+
+Format: `{vertices}{edges}`
+
+Where:
+
+- `{vertices}` is a `List` in which each item is a `Vertex`.
+- `{edges}` is a `List` in which each item is an `Edge`.
+
+==== Vertex
+
+Format: `{id}{label}{properties}`
+
+Where:
+
+- `{id}` is a fully qualified typed value composed of `{type_code}{value}`.
+- `{label}` is a `String` value.
+- `{properties}` is a `List` of `VertexProperty` items.
+
+==== VertexProperty
+
+Format: `{id}{label}{value}`
+
+Where:
+
+- `{id}` is a fully qualified typed value composed of `{type_code}{value}`.
+- `{label}` is a `String` value.
+- `{value}` is a fully qualified typed value composed of `{type_code}{value}`.
+
+==== Barrier
+
+Format: a single `String` representing the enum value.
+
+==== Binding
+
+Format: `{key}{value}`
+
+Where:
+
+- `{key}` is a `String` value.
+- `{value}` is a fully qualified typed value composed of `{type_code}{value}`.
+
+==== Bytecode
+
+Format: `{steps_length}{step_0}...{step_n}{sources_length}{source_0}...{source_n}`
+
+Where:
+
+* `{steps_length}` is an `Int` value describing the number of steps.
+* `{step_i}` is composed of `{name}{values_length}{value_0}...{value_n}`, where:
+** `{name}` is a `String`.
+** `{values_length}` is an `Int` describing the number of values.
+** `{value_i}` is a fully qualified typed value composed of `{type_code}{value}` describing the step argument.
+* `{sources_length}` is an `Int` value describing the number of source instructions.
+* `{source_i}` is composed of `{name}{values_length}{value_0}...{value_n}`, where:
+** `{name}` is a `String`.
+** `{values_length}` is an `Int` describing the number of values.
+** `{value_i}` is a fully qualified typed value composed of `{type_code}{value}`.
+
+==== Cardinality
+
+Format: a single `String` representing the enum value.
+
+==== Column
+
+Format: a single `String` representing the enum value.
+
+==== Direction
+
+Format: a single `String` representing the enum value.
+
+==== Operator
+
+Format: a single `String` representing the enum value.
+
+==== Order
+
+Format: a single `String` representing the enum value.
+
+==== Pick
+
+Format: a single `String` representing the enum value.
+
+==== Pop
+
+Format: a single `String` representing the enum value.
+
+==== Lambda
+
+Format: `{language}{script}{arguments_length}`
+
+Where:
+
+- `{language}` is a `String`.
+- `{script}` is a `String`.
+- `{arguments_length}` is an `Int`.
+
+==== P
+
+Format: `{predicate}{values_length}{value_0}...{value_n}`
+
+Where:
+
+- `{predicate}` is a `String`.
+- `{values_length}` is an `Int` describing the number of values.
+- `{value_i}` is a fully qualified typed value composed of `{type_code}{value}`.
+
+==== Scope
+
+Format: a single `String` representing the enum value.
+
+==== T
+
+Format: a single `String` representing the enum value.
+
+==== Traverser
+
+Format: `{bulk}{value}`
+
+Where:
+
+- `{bulk}` is an `Int`.
+- `{value}` is a fully qualified typed value composed of `{type_code}{value}`.
+
+==== BigDecimal
+
+Represents an arbitrary-precision signed decimal number, consisting of an arbitrary precision integer
+unscaled value and a 32-bit integer scale.
+
+Format: `{scale}{unscaled_value}`
+
+Where:
+
+- `{scale}` is an `Int`.
+- `{unscaled_value}` is a `BigInteger`.
+
+==== BigInteger
+
+A variable-length two's complement encoding of a signed integer.
+
+Example values
+
+- `00`: Integer 0.
+- `01`: Integer 1.
+- `7f`: Integer 127.
+- `00 80`: Integer 128.
+- `ff`: Integer -1.
+- `80`: Integer -128.
+- `ff 7f`: Integer -129.
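+
+The example values follow from choosing the shortest two's complement representation that still holds the
+sign bit. A minimal Python sketch with hypothetical function names; it covers only the value bytes, since the
+text above does not say how the number of bytes is conveyed:
+
```python
def encode_big_integer(value: int) -> bytes:
    """Shortest big-endian two's complement encoding of a signed integer."""
    # Bits needed for the magnitude; for negatives, ~value == -value - 1.
    bits = value.bit_length() if value >= 0 else (~value).bit_length()
    # One extra bit is needed for the sign, then round up to whole bytes.
    length = bits // 8 + 1
    return value.to_bytes(length, "big", signed=True)


def decode_big_integer(data: bytes) -> int:
    return int.from_bytes(data, "big", signed=True)


if __name__ == "__main__":
    for n in (0, 1, 127, 128, -1, -128, -129):
        print(n, encode_big_integer(n).hex())
```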
+
+==== Byte
+
+An unsigned 8-bit integer.
+
+==== ByteBuffer
+
+Format: `{length}{value}`
+
+Where:
+
+- `{length}` is an `Int` representing the number of bytes contained in the value.
+- `{value}` is a sequence of bytes.
+
+==== Short
+
+Format: 2-byte two's complement integer.
+
+==== Custom
+
+A custom type, represented with a name and a blob value.
+
+Format: `{name}{blob}`
+
+Where:
+
+- `{name}` is a `String`.
+- `{blob}` is a `ByteBuffer`.
+
+==== Char
+
+Format: one to four bytes representing a single UTF-8 character, according to the Unicode standard.
+
+For characters `0x00`-`0x7F`, UTF-8 encodes the character as a single byte.
+
+For characters `0x80`-`0x7FF`, UTF-8 uses 2 bytes: the first byte is binary `110` followed by the 5 high bits
+of the character, while the second byte is binary `10` followed by the 6 low bits of the character.
+
+The 3 and 4-byte encodings are similar to the 2-byte encoding, except that the first byte of the 3-byte
+encoding starts with `1110` and the first byte of the 4-byte encoding starts with `11110`.
+
+Example values (hex bytes)
+
+- `61`: Character 'a'.
+- `c2 a2`: Character '¢'.
+- `e2 82 ac`: Character '€'.
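+
+The leading-bit rules described above are enough to tell how many bytes a `Char` occupies before decoding it.
+A small Python sketch with hypothetical helper names; note that 'a' encodes as the single byte `0x61`
+(decimal 97):
+
```python
def encode_char(ch: str) -> bytes:
    """Encode exactly one character as UTF-8 (one to four bytes)."""
    if len(ch) != 1:
        raise ValueError("expected a single character")
    return ch.encode("utf-8")


def utf8_length(first_byte: int) -> int:
    """Return the total byte count of a UTF-8 sequence from its first byte."""
    if first_byte & 0x80 == 0x00:  # 0xxxxxxx
        return 1
    if first_byte & 0xE0 == 0xC0:  # 110xxxxx
        return 2
    if first_byte & 0xF0 == 0xE0:  # 1110xxxx
        return 3
    if first_byte & 0xF8 == 0xF0:  # 11110xxx
        return 4
    raise ValueError("not a valid first byte of a UTF-8 sequence")


if __name__ == "__main__":
    print(encode_char("a").hex())       # 61
    print(encode_char("\u00a2").hex())  # c2a2 (cent sign)
    print(encode_char("\u20ac").hex())  # e282ac (euro sign)
```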
+
+==== Duration
+
+A time-based amount of time.
+
+Format: `{seconds}{nanos}`
+
+Where:
+
+- `{seconds}` is a `Long`.
+- `{nanos}` is an `Int`.
+
+==== InetAddress
+
+Format: Same as `ByteBuffer`.
+
+==== Instant
+
+An instantaneous point on the time-line.
+
+Format: `{seconds}{nanos}`
+
+Where:
+
+- `{seconds}` is a `Long`.
+- `{nanos}` is an `Int`.
+
+==== LocalDate
+
+A date without a time-zone in the ISO-8601 calendar system.
+
+Format: `{year}{month}{day}`
+
+Where:
+
+- `{year}` is an `Int` from -999,999,999 to 999,999,999.
+- `{month}` is a `Byte` representing the month, from 1 (January) to 12 (December).
+- `{day}` is a `Byte` from 1 to 31.
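+
+A sketch of the `{year}{month}{day}` layout in Python, using a hypothetical `encode_local_date` helper.
+Python's `datetime.date` only covers years 1 to 9999, so it exercises a subset of the range given above.
+
```python
import struct
from datetime import date


def encode_local_date(value: date) -> bytes:
    """Int year followed by one byte each for month and day (big-endian)."""
    return struct.pack(">iBB", value.year, value.month, value.day)


if __name__ == "__main__":
    print(encode_local_date(date(2018, 10, 26)).hex())  # 000007e20a1a
```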
+
+==== LocalDateTime
+
+Format: `{date}{time}`
+
+Where:
+
+- `{date}` is a `LocalDate`.
+- `{time}` is a `LocalTime`.
+
+==== LocalTime
+
+A time without a time-zone in the ISO-8601 calendar system.
+
+Format: An 8-byte two's complement long representing nanoseconds since midnight.
+
+Valid values are in the range 0 to 86399999999999.
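+
+The nanoseconds-since-midnight value can be computed as in the following Python sketch. The helper name is
+hypothetical, and because `datetime.time` only has microsecond precision, the largest value it can produce is
+86399999999000 rather than the full upper bound.
+
```python
import struct
from datetime import time

NANOS_PER_SECOND = 1_000_000_000


def encode_local_time(value: time) -> bytes:
    """Encode a time of day as nanoseconds since midnight (8 bytes, big-endian)."""
    seconds = (value.hour * 60 + value.minute) * 60 + value.second
    nanos = seconds * NANOS_PER_SECOND + value.microsecond * 1000
    return struct.pack(">q", nanos)


if __name__ == "__main__":
    print(encode_local_time(time(0, 0, 1)).hex())  # 000000003b9aca00
```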
+
+==== MonthDay
+
+A month-day in the ISO-8601 calendar system.
+
+Format: `{month}{day}`
+
+Where:
+
+- `{month}` is a `Byte` value from 1 to 12.
+- `{day}` is a `Byte` value from 1 to 31.
+
+==== OffsetDateTime
+
+A date-time with an offset from UTC/Greenwich in the ISO-8601 calendar system, such as 2007-12-03T10:15:30+01:00.
+
+Format: `{local_date_time}{offset}`
+
+Where:
+
+- `{local_date_time}` is a `LocalDateTime`.
+- `{offset}` is a `ZoneOffset`.
+
+==== OffsetTime
+
+A time with an offset from UTC/Greenwich in the ISO-8601 calendar system, such as 10:15:30+01:00.
+
+Format: `{local_time}{offset}`
+
+Where:
+
+- `{local_time}` is a `LocalTime`.
+- `{offset}` is a `ZoneOffset`.
+
+==== Period
+
+A date-based amount of time in the ISO-8601 calendar system, such as '2 years, 3 months and 4 days'.
+
+Format: `{years}{months}{days}`
+
+Where:
+
+`{years}`, `{months}` and `{days}` are `Int` values.
+
+==== Year
+
+A year in the ISO-8601 calendar system, such as 2018.
+
+Format: An `Int` representing the years.
+
+==== YearMonth
+
+A year-month in the ISO-8601 calendar system, such as 2007-12.
+
+Format: `{year}{month}`
+
+Where:
+
+- `{year}` is an `Int`.
+- `{month}` is a `Byte` from 1 to 12.
+
+==== ZonedDateTime
+
+A date-time with a time-zone in the ISO-8601 calendar system.
+
+Format: `{local_date_time}{zone_id}`
+
+Where:
+
+- `{local_date_time}` is a `LocalDateTime`.
+- `{time}` is a `LocalTime`.
+
+==== ZoneOffset
+
+A time-zone offset from Greenwich/UTC, such as +02:00.
+
+Format: An `Int` representing total zone offset in seconds.
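+
+For example, the offset +02:00 is 7200 seconds. A minimal Python sketch, assuming the offset is supplied as a
+`timedelta` and using a hypothetical helper name:
+
```python
import struct
from datetime import timedelta


def encode_zone_offset(offset: timedelta) -> bytes:
    """Encode a UTC offset as its total number of seconds (4 bytes, big-endian, signed)."""
    return struct.pack(">i", int(offset.total_seconds()))


if __name__ == "__main__":
    print(encode_zone_offset(timedelta(hours=2)).hex())  # 00001c20
```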
diff --git a/docs/src/dev/io/index.asciidoc b/docs/src/dev/io/index.asciidoc
index 5620fc3..73f5d5c 100644
--- a/docs/src/dev/io/index.asciidoc
+++ b/docs/src/dev/io/index.asciidoc
@@ -34,3 +34,5 @@ include::graphml.asciidoc[]
 include::graphson.asciidoc[]
 
 include::gryo.asciidoc[]
+
+include::graphbinary.asciidoc[]
\ No newline at end of file

