cassandra-commits mailing list archives

From "Robert Stupp (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-7970) JSON support for CQL
Date Sun, 01 Feb 2015 09:07:37 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14300120#comment-14300120 ]

Robert Stupp commented on CASSANDRA-7970:
-----------------------------------------

I took a short look at the patch and played around a little bit with this.

Some comments:
* A test with null values in {{testFromJsonFct()}} and {{testToJsonFct()}} would be nice (toJson
handles null correctly, fromJson throws an NPE for {{INSERT INTO tojson (k, asciival ) VALUES
( 0, fromJson(null) );}})
* Can you add an {{INSERT JSON}} variant that tackles tuples and types?
* I tried {{insert into some_table (k, asciival ) JSON \{"k": 0, "asciival": "foobar"\};}}
in cqlsh and it fails with a syntax error (obviously my bad). I was wondering whether we really
need to mention the column names, since they are contained in the JSON. Couldn't an {{insert
into some_table JSON ?;}} with the JSON {{"\{"k": 0, "asciival": "foobar"\}"}} also work?
* The Java Driver tries to contact a coordinator that owns the target partition. This gets
complicated with {{INSERT JSON}} since the driver would have to parse the JSON before it actually
knows the "correct" node. Maybe we can discuss a follow-up that allows both {{VALUES}} and
{{JSON}} - e.g. {{INSERT INTO some_table (partitionKey, clusteringKey) VALUES (1, 2) JSON
?}}, where the {{JSON ?}} part contains more columns.
* Missing license header in {{Json.java}}
* In {{FunctionCall}} you replaced {{fun}} with {{fun.name()}}, which eliminates the function
signature from the exception message
* {{Selection.rowToJson}} : use {{Collections.singleton}} instead of {{Arrays.asList}}
* {{Selection.rowToJson}} : it would be nicer (and would eliminate one String instance) to replace
{{sb.append(JSONValue.escape(columnName))}} with something like {{JSONValue.escape(columnName, sb);}}
(i.e. make escape write to sb)
* Similar for {{AbstractType.toJSONString}} : let the implementation write to the string builder
* {{ReversedType}} : not sure whether the impl is correct - doesn’t it need to reverse the
binary representation?
* {{FromJsonFct}} + {{ToJsonFct}} : would be nicer to have a CHM ({{ConcurrentHashMap}}) instead of
a synchronized HashMap. Also, the maps in these classes have {{AbstractType}} as the key and entries
are never evicted, which means a changed user type would remain in these maps forever. This might
become irrelevant when {{AbstractType}} is removed, so I'm not sure whether we should fix this now;
maybe open a follow-up ticket for it? A simple fix would be to 'statically cache' function
instances for primitive types but always create a new instance of these functions for tuples
and UDTs; then you could pre-create the function instances, completely eliminating the need
for a CHM or synchronization.
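
To illustrate the last bullet, here is a minimal sketch of the "pre-create for primitives, always create a new instance for tuples and UDTs" idea. All names in it ({{JsonFunctionCacheSketch}}, {{Kind}}, {{TypeStub}}, {{ToJsonStub}}) are hypothetical stand-ins and not the classes from the patch; the real functions are keyed by {{AbstractType}}, which this sketch does not model.

```java
import java.util.Collections;
import java.util.EnumMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: none of these types exist in Cassandra.
final class JsonFunctionCacheSketch {
    enum Kind { ASCII, INT, TEXT, TUPLE, UDT }

    // Stand-in for a CQL type; "definition" would change when a UDT is altered.
    static final class TypeStub {
        final Kind kind;
        final String definition;

        TypeStub(Kind kind, String definition) {
            this.kind = kind;
            this.definition = definition;
        }

        boolean isPrimitive() {
            return kind != Kind.TUPLE && kind != Kind.UDT;
        }
    }

    // Stand-in for a toJson/fromJson function bound to one type.
    static final class ToJsonStub {
        final TypeStub type;

        ToJsonStub(TypeStub type) {
            this.type = type;
        }
    }

    // Populated once during class initialization and never mutated afterwards,
    // so lookups need neither a ConcurrentHashMap nor synchronization.
    private static final Map<Kind, ToJsonStub> PRIMITIVES;

    static {
        Map<Kind, ToJsonStub> m = new EnumMap<>(Kind.class);
        for (Kind k : Kind.values()) {
            TypeStub t = new TypeStub(k, k.name().toLowerCase(Locale.ROOT));
            if (t.isPrimitive()) {
                m.put(k, new ToJsonStub(t));
            }
        }
        PRIMITIVES = Collections.unmodifiableMap(m);
    }

    static ToJsonStub getInstance(TypeStub type) {
        if (type.isPrimitive()) {
            return PRIMITIVES.get(type.kind); // shared, pre-created instance
        }
        // Tuples and UDTs: no caching, so a changed definition is never retained.
        return new ToJsonStub(type);
    }
}
```

The trade-off in this sketch is one extra allocation per lookup for tuples and UDTs in exchange for having no eviction problem at all.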

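For the {{Selection.rowToJson}} and {{AbstractType.toJSONString}} bullets above, a rough sketch of an escape routine that writes into the caller's builder instead of returning a new String. This is a hand-rolled illustration under my own assumptions, not the json-simple implementation; the class name {{JsonEscapeSketch}} and the two-argument {{escape}} signature are hypothetical.

```java
// Hypothetical sketch: escapes a string for JSON directly into an existing StringBuilder.
final class JsonEscapeSketch {
    static void escape(String s, StringBuilder sb) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\b': sb.append("\\b");  break;
                case '\f': sb.append("\\f");  break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters become \u00XX (c < 0x20 fits in two hex digits).
                        sb.append("\\u00")
                          .append(Character.forDigit(c >> 4, 16))
                          .append(Character.forDigit(c & 0xF, 16));
                    } else {
                        sb.append(c);
                    }
            }
        }
    }
}
```

A caller would then write something like {{JsonEscapeSketch.escape(columnName, sb)}} in place of {{sb.append(escape(columnName))}}, which avoids the intermediate String.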
Altogether: really nice work :)


> JSON support for CQL
> --------------------
>
>                 Key: CASSANDRA-7970
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API
>            Reporter: Jonathan Ellis
>            Assignee: Tyler Hobbs
>             Fix For: 3.0
>
>         Attachments: 7970-trunk-v1.txt
>
>
> JSON is popular enough that not supporting it is becoming a competitive weakness.  We
> can add JSON support in a way that is compatible with our performance goals by *mapping* JSON
> to an existing schema: one JSON document maps to one CQL row.
> Thus, it is NOT a goal to support schemaless documents, which is a misfeature [1] [2]
> [3].  Rather, it is to allow a convenient way to easily turn a JSON document from a service
> or a user into a CQL row, with all the validation that entails.
> Since we are not looking to support schemaless documents, we will not be adding a JSON
> data type (CASSANDRA-6833) a la postgresql.  Rather, we will map the JSON to UDT, collections,
> and primitive CQL types.
> Here's how this might look:
> {code}
> CREATE TYPE address (
>   street text,
>   city text,
>   zip_code int,
>   phones set<text>
> );
> CREATE TABLE users (
>   id uuid PRIMARY KEY,
>   name text,
>   addresses map<text, address>
> );
> INSERT INTO users JSON
> {"id": 4b856557-7153,
>    "name": "jbellis",
>    "addresses": {"home": {"street": "123 Cassandra Dr",
>                           "city": "Austin",
>                           "zip_code": 78747,
>                           "phones": ["2101234567"]}}};
> SELECT JSON id, addresses FROM users;
> {code}
> (We would also want to_json and from_json functions to allow mapping a single column's
> worth of data.  These would not require extra syntax.)
> [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
> [2] https://blog.compose.io/schema-less-is-usually-a-lie/
> [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
