incubator-couchdb-user mailing list archives

From Jens Alfke <j...@couchbase.com>
Subject Re: Duplicate fields in documents
Date Thu, 20 Feb 2014 04:30:15 GMT

On Feb 19, 2014, at 6:07 AM, Dave Cottlehuber <dch@jsonified.com> wrote:

> TL;DR the appropriately named ECMA 404 JSON spec [1] is broken or more politely, insufficiently
> specific.

This seems to fall into the category of "things so obvious that the people who wrote the spec
didn't realize they had to mention them." I.e. "You can't have duplicate keys."

> JSON is typically based on a dictionary or hash map, and there’s no particular reason
> for that data structure to enforce uniqueness of keys.

I disagree. Mathematically, a dictionary/map object is a function: it maps from a set of keys
to a set of values, with each key mapping to exactly one value. (That's basically the definition
of 'function'.) It's certainly possible to create implementations that map a key to _multiple_
values, but that's something different: it's a mapping from a key to a set. (For example,
it's not from string-->int, it's now from string-->set<int>.) The JSON spec does
not include this kind of mapping — an object value in JSON can be a number, but not a set
of numbers.

There _are_ data formats out there that explicitly support multiple values for a key. The
best-known one is probably MIME/HTTP headers. Parsers for this tend to use a representation
that's a mapping from a string to a set or array of strings.
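
To illustrate the string-to-list representation described above, here is a minimal sketch in Python. The function name `parse_headers` and the sample header lines are hypothetical, and it ignores real-world details such as folded header lines; it only shows that repeated names accumulate into a list instead of overwriting each other.

```python
from collections import defaultdict

def parse_headers(lines):
    """Map each header name to the list of all values seen for it.

    Hypothetical helper for illustration only: no support for folded
    lines, encodings, or malformed input.
    """
    headers = defaultdict(list)
    for line in lines:
        # Split on the first colon; header names are case-insensitive.
        name, _, value = line.partition(":")
        headers[name.strip().lower()].append(value.strip())
    return dict(headers)

parsed = parse_headers([
    "Set-Cookie: a=1",
    "Set-Cookie: b=2",
    "Host: example.com",
])
```

Here every key still maps to exactly one value; that value just happens to be a list, which is the string-->list<string> shape rather than string-->string.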

IMHO the reasonable thing for a JSON parser to do if it encounters a duplicate key is to fail
with a clear error. Failing that, the only other reasonable option is to discard one or the
other value (I don't have an opinion on which). But keeping both is unreasonable.
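
As a sketch of the "fail with a clear error" option, the following uses the `object_pairs_hook` parameter of Python's standard `json` module, which hands the decoder's raw list of key/value pairs for each object to a callback. The names `DuplicateKeyError`, `reject_duplicates`, and `strict_loads` are made up for this example; this is one possible approach, not how any CouchDB component behaves.

```python
import json

class DuplicateKeyError(ValueError):
    """Raised when a JSON object repeats a key (hypothetical name)."""

def reject_duplicates(pairs):
    # `pairs` is the list of (key, value) tuples for one JSON object,
    # in the order they appeared in the text.
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise DuplicateKeyError(f"duplicate key in JSON object: {key!r}")
        seen[key] = value
    return seen

def strict_loads(text):
    # The hook runs for every object, including nested ones.
    return json.loads(text, object_pairs_hook=reject_duplicates)
```

Without the hook, `json.loads('{"a": 1, "a": 2}')` silently keeps the last value and returns `{'a': 2}`, which corresponds to the "discard one or the other" fallback.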

(The Erlang JSON parser is already being weird and nonstandard in preserving the order of
keys. This implicitly led to interoperability problems like the MIME multipart representation
of a document not having a clear mapping of attachment names to MIME bodies, because whoever
wrote it decided to put the MIME bodies in "the same order as" the attachment keys, not realizing
that there isn't any order.)
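
A small sketch of why position-based pairing is fragile, again in Python. The attachment names, the `length` field, and the helper `pair_by_name` are invented for illustration and do not reflect CouchDB's actual multipart format. Two JSON texts that differ only in key order decode to equal objects, so looking bodies up by name is the only pairing that survives a reordering.

```python
import json

# Same object, keys written in a different order (hypothetical data).
a = json.loads('{"foo.txt": {"length": 3}, "bar.txt": {"length": 5}}')
b = json.loads('{"bar.txt": {"length": 5}, "foo.txt": {"length": 3}}')
same_object = (a == b)  # equality ignores key order

def pair_by_name(attachments, bodies_by_name):
    """Match each attachment to its body by name, not by position."""
    return {name: bodies_by_name[name] for name in attachments}

bodies = {"foo.txt": b"abc", "bar.txt": b"12345"}
paired_a = pair_by_name(a, bodies)
paired_b = pair_by_name(b, bodies)
```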

—Jens