arrow-dev mailing list archives

From "Julien Le Dem (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ARROW-372) Create JSON arrow file format for integration tests
Date Tue, 08 Nov 2016 22:39:58 GMT

    [ https://issues.apache.org/jira/browse/ARROW-372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15649014#comment-15649014 ]

Julien Le Dem commented on ARROW-372:
-------------------------------------

The JSON representation of the schema is defined here: https://github.com/apache/arrow/blob/master/format/Metadata.md#schemas
Example:
{noformat}
  "schema" : {
    "fields" : [{
      "name" : "int",
      "nullable" : true,
      "type" : {
        "name" : "int",
        "bitWidth" : 32,
        "isSigned" : true
      },
      "children" : [ ],
      "typeLayout" : {
        "vectors" : [{
          "type" : "VALIDITY",
          "typeBitWidth" : 1
        },{
          "type" : "DATA",
          "typeBitWidth" : 32
        }]
      }
    },{
      "name" : "bigInt",
      "nullable" : true,
      "type" : {
        "name" : "int",
        "bitWidth" : 64,
        "isSigned" : true
      },
      "children" : [ ],
      "typeLayout" : {
        "vectors" : [{
          "type" : "VALIDITY",
          "typeBitWidth" : 1
        },{
          "type" : "DATA",
          "typeBitWidth" : 64
        }]
      }
    },{
      "name" : "list",
      "nullable" : true,
      "type" : {
        "name" : "list"
      },
      "children" : [{
        "nullable" : true,
        "type" : {
          "name" : "utf8"
        },
        "children" : [ ],
        "typeLayout" : {
          "vectors" : [{
            "type" : "VALIDITY",
            "typeBitWidth" : 1
          },{
            "type" : "OFFSET",
            "typeBitWidth" : 32
          },{
            "type" : "DATA",
            "typeBitWidth" : 8
          }]
        }
      }],
      "typeLayout" : {
        "vectors" : [{
          "type" : "VALIDITY",
          "typeBitWidth" : 1
        },{
          "type" : "OFFSET",
          "typeBitWidth" : 32
        }]
      }
    },{
      "name" : "map",
      "nullable" : false,
      "type" : {
        "name" : "struct"
      },
      "children" : [{
        "name" : "timestamp",
        "nullable" : true,
        "type" : {
          "name" : "timestamp",
          "unit" : "MILLISECOND"
        },
        "children" : [ ],
        "typeLayout" : {
          "vectors" : [{
            "type" : "VALIDITY",
            "typeBitWidth" : 1
          },{
            "type" : "DATA",
            "typeBitWidth" : 64
          }]
        }
      }],
      "typeLayout" : {
        "vectors" : [{
          "type" : "VALIDITY",
          "typeBitWidth" : 1
        }]
      }
    }]
  },
{noformat}
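
As a rough illustration of how a test harness might consume this schema, the sketch below walks each field depth-first and lists the vectors its "typeLayout" declares. It is only a sketch: the helper name vector_layout and the "<unnamed>" placeholder are hypothetical, and only the "int" and "list" fields from the example are included.

```python
import json

# Two fields copied from the example schema ("int" and "list").
SCHEMA = json.loads("""
{
  "fields": [{
    "name": "int", "nullable": true,
    "type": {"name": "int", "bitWidth": 32, "isSigned": true},
    "children": [],
    "typeLayout": {"vectors": [
      {"type": "VALIDITY", "typeBitWidth": 1},
      {"type": "DATA", "typeBitWidth": 32}]}
  }, {
    "name": "list", "nullable": true,
    "type": {"name": "list"},
    "children": [{
      "nullable": true,
      "type": {"name": "utf8"},
      "children": [],
      "typeLayout": {"vectors": [
        {"type": "VALIDITY", "typeBitWidth": 1},
        {"type": "OFFSET", "typeBitWidth": 32},
        {"type": "DATA", "typeBitWidth": 8}]}
    }],
    "typeLayout": {"vectors": [
      {"type": "VALIDITY", "typeBitWidth": 1},
      {"type": "OFFSET", "typeBitWidth": 32}]}
  }]
}
""")


def vector_layout(field, prefix=""):
    """Yield (path, type name, [(vector type, bit width), ...]) depth-first."""
    # The list child in the example has no "name", so use a placeholder
    # (hypothetical choice, not part of the format).
    path = prefix + field.get("name", "<unnamed>")
    vectors = [(v["type"], v["typeBitWidth"])
               for v in field["typeLayout"]["vectors"]]
    yield path, field["type"]["name"], vectors
    for child in field["children"]:
        yield from vector_layout(child, prefix=path + ".")


layout = [row for f in SCHEMA["fields"] for row in vector_layout(f)]
for path, type_name, vectors in layout:
    print(path, type_name, vectors)
```

Flattening the tree this way gives one row per field, which could then be matched against the vectors present in each column of a record batch.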

> Create JSON arrow file format for integration tests
> ---------------------------------------------------
>
>                 Key: ARROW-372
>                 URL: https://issues.apache.org/jira/browse/ARROW-372
>             Project: Apache Arrow
>          Issue Type: Task
>          Components: Java - Vectors
>            Reporter: Julien Le Dem
>            Assignee: Julien Le Dem
>
> {noformat}
> {
>   "schema" : ...,
>   "batches" : [{
>     "count" : 10,
>     "columns" : [
>       {
>         "name": "{col_name_int}",
>         "count" : 10,
>         "VALIDITY" : [1,1,1,1,1,1,1,1,1,1],
>         "DATA" : [0,1,2,3,4,5,6,7,8,9]
>       },
>       { 
>         "name": "{col_name_list}",
>         "count" : 10,
>         "VALIDITY" : [1,1,1,1,1,1,1,1,1,1],
>         "OFFSET" : [0,0,1,3,3,4,6,6,7,9],
>         "children" : [
>           {
>             "name": "child_name",
>             "count" : 9,
>             "VALIDITY" : [1,1,1,1,1,1,1,1,1],
>             "OFFSET" : [0,3,6,9,12,15,18,21,24],
>             "DATA" : ["abc","abc","abc","abc","abc","abc","abc","abc","abc"]
>           }
>         ]
>       },
>       {
>         "name": "{col_name_map}",
>         "count" : 10,
>         "VALIDITY" : [1,1,1,1,1,1,1,1,1,1],
>         "children" : [
>           {
>             "name": "{col_name_timestamp}",
>             "count" : 10,
>             "VALIDITY" : [1,1,1,1,1,1,1,1,1,1],
>             "DATA" : [0,1,2,3,4,5,6,7,8,9]
>           }
>         ]
>       }
>     ]
>   }, ... ]
> }
> {noformat}
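
A reader of this batch layout would probably want a sanity check that each column's vectors agree with its "count". The sketch below is one way to do that, assuming "children" is a JSON array of column objects and that VALIDITY and DATA carry one entry per value. The function name check_column is hypothetical, and the sample column is modelled on the list column of the draft with a deliberately mismatched child VALIDITY.

```python
def check_column(column, path=""):
    """Return a list of length mismatches for one column and its children."""
    name = path + column["name"]
    count = column["count"]
    problems = []
    # Only VALIDITY and DATA are checked (assumed: one entry per value).
    # OFFSET is skipped because the draft does not say whether it carries
    # count or count + 1 entries.
    for vector in ("VALIDITY", "DATA"):
        if vector in column and len(column[vector]) != count:
            problems.append(
                f"{name}.{vector}: expected {count}, got {len(column[vector])}")
    for child in column.get("children", []):
        problems.extend(check_column(child, path=name + "."))
    return problems


# Sample column shaped like the draft's list column; the child declares
# count 9 but is given 10 VALIDITY entries on purpose.
list_column = {
    "name": "col_name_list",
    "count": 10,
    "VALIDITY": [1] * 10,
    "OFFSET": [0, 0, 1, 3, 3, 4, 6, 6, 7, 9],
    "children": [{
        "name": "child_name",
        "count": 9,
        "VALIDITY": [1] * 10,
        "OFFSET": [0, 3, 6, 9, 12, 15, 18, 21, 24],
        "DATA": ["abc"] * 9,
    }],
}

problems = check_column(list_column)
print(problems)
```

Running the check per column of every batch would flag hand-written test files whose vectors drifted out of sync with their declared counts.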



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)