From: Chet Murthy
Date: Tue, 17 Oct 2017 11:59:59 -0700
Subject: "iterated container types" and a nicer JSON wire protocol (2 of 2)
To: dev@thrift.apache.org

[A second note to follow up on the description of my questions around a new JSON wire protocol.]

The most ... "intractable" problem with a nicer wire protocol seems to be in dealing with iterated containers, and specifically with their types. Consider a message like:

    struct Foo {
      1: required list<list<i32>> l;
    }

The generated read() code for member "l" looks like this:

    {
      uint32_t _size76;
      ::apache::thrift::protocol::TType _etype79;
      xfer += iprot->readListBegin(_etype79, _size76);
      uint32_t _i80;
      for (_i80 = 0; _i80 < _size76; ++_i80)
      {
        {
          uint32_t _size81;
          ::apache::thrift::protocol::TType _etype84;
          xfer += iprot->readListBegin(_etype84, _size81);
          uint32_t _i85;
          for (_i85 = 0; _i85 < _size81; ++_i85)
          {
            ...
          }
          xfer += iprot->readListEnd();
        }
      }
      xfer += iprot->readListEnd();
    }

In this code, readListBegin() is supposed to return the size of the list and the type of its elements. The size gets used, but the type does not get used HERE (after all, the generated code already knows the type). BUT IT IS USED in the generic skip() template functions. So:

(1) morally, it seems like readListBegin() MUST return a correct type

(2) in fact, both binary & compact protocols do this -- further evidence that this is a contract that protocols should fulfil.

But in any "nice" JSON protocol, it's going to be complicated to return that type.

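
To illustrate why the element type matters to skip(), here is a minimal sketch. It does not use the real Thrift classes: ToyProtocol, its flat word-stream encoding, and skipThenReadNext() are all made up for illustration, and TType is a cut-down stand-in for the real enum. The only point is that a schema-less skip has nothing to go on except what readListBegin() reports.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, cut-down stand-in for Thrift's TType.
enum class TType : int32_t { T_I32 = 0, T_LIST = 1 };

// Hypothetical toy protocol over a flat stream of 32-bit words.
// A list is encoded as [elemType, size, elements...]; an i32 is one word.
struct ToyProtocol {
  std::vector<int32_t> words;
  std::size_t pos = 0;

  int32_t readI32() {
    assert(pos < words.size());
    return words[pos++];
  }

  void readListBegin(TType& elemType, uint32_t& size) {
    elemType = static_cast<TType>(readI32());
    size = static_cast<uint32_t>(readI32());
  }
};

// Generic skip: it has no schema, so it relies entirely on the element
// type reported by readListBegin() to know how to skip each element.
void skip(ToyProtocol& prot, TType type) {
  switch (type) {
    case TType::T_I32:
      prot.readI32();
      break;
    case TType::T_LIST: {
      TType elemType;
      uint32_t size;
      prot.readListBegin(elemType, size);
      for (uint32_t i = 0; i < size; ++i) {
        skip(prot, elemType);
      }
      break;
    }
  }
}

// Encodes [[1,2,3],[4,5,6]] followed by a sentinel word, skips the whole
// nested list, and returns the word the reader is positioned on afterwards.
int32_t skipThenReadNext() {
  ToyProtocol prot;
  prot.words = {1, 2,            // outer list: elem type T_LIST, size 2
                0, 3, 1, 2, 3,   // inner list: elem type T_I32, size 3
                0, 3, 4, 5, 6,   // inner list: elem type T_I32, size 3
                99};             // sentinel following the field
  skip(prot, TType::T_LIST);
  return prot.readI32();
}
```

If readListBegin() reported the wrong element type for the outer list here, skip() would misread the stream and lose its position, which is why the returned type looks like part of the protocol contract.
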
Consider a possible serialization of an instance of that struct:

    { "l": [ [1,2,3], [4,5,6] ] }

Demarshalling this could proceed as follows:

(0) deserialize into a JSON DOM so we can find the sizes of the arrays above

(1) then the calls:

    readStructBegin()
    readFieldBegin()  // to read the "l" and ":", return field-id, field-type (1, T_LIST)
    readListBegin()   // read first "[", return size, elem-type (2, T_LIST)
    readListBegin()   // read second "[", return size, elem-type (3, T_I32)
    readI32()         // read "1"
    ... etc ...

(2) readFieldBegin() can use lookaside data to map the string "l" to <1, T_LIST>

(3) with some trickery, the first readListBegin() could do the same (and since we deserialized into a JSON DOM already, computing size is easy)

(4) BUT without keeping full recursive tree type-data-structures around at runtime, I don't see how the second readListBegin() can be properly/correctly implemented.

And this is a simple case. If we consider something like map > >, it seems pretty clear that the JSON protocol module would have to be stepping through a tree-structured state machine in sync with the generated read() method. And that doesn't seem like a recipe for maintainability.

I do think there's another way to solve this problem, but I don't want to address it until I've closed off this particular avenue of investigation.

Any comments/advice welcome.

Thanks,
--chet--
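
To make point (4) concrete, here is a rough sketch of the kind of runtime type tree and cursor a JSON protocol might have to carry around. Nothing here exists in Thrift: TypeNode, TypeCursor, enterList(), leaveList(), and traceNestedList() are hypothetical names, and TType is a cut-down stand-in. It only handles lists; maps and structs would need more states, which is exactly the maintainability worry.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical, cut-down stand-in for Thrift's TType.
enum class TType { T_I32, T_LIST };

// Hypothetical runtime type descriptor: a tree mirroring the IDL type.
struct TypeNode {
  TType type;
  std::shared_ptr<TypeNode> elem;  // element type for T_LIST, null otherwise
};

// Hypothetical cursor that a protocol would advance in lockstep with the
// generated read() method, so it always knows the type of the next value.
class TypeCursor {
 public:
  explicit TypeCursor(std::shared_ptr<TypeNode> fieldType)
      : expected_(std::move(fieldType)) {}

  // Would be called from readListBegin(): descend into the list and
  // report the element type taken from the descriptor tree.
  TType enterList() {
    assert(expected_ && expected_->type == TType::T_LIST);
    stack_.push_back(expected_);
    expected_ = expected_->elem;
    return expected_->type;
  }

  // Would be called from readListEnd(): pop back out, so that a following
  // sibling list is again matched against the same descriptor.
  void leaveList() {
    assert(!stack_.empty());
    expected_ = stack_.back();
    stack_.pop_back();
  }

 private:
  std::shared_ptr<TypeNode> expected_;            // type of the next value
  std::vector<std::shared_ptr<TypeNode>> stack_;  // enclosing lists
};

// Replays the readListBegin()/readListEnd() calls for [[1,2,3],[4,5,6]]
// against list<list<i32>> and records each reported element type.
std::vector<TType> traceNestedList() {
  auto i32 = std::make_shared<TypeNode>(TypeNode{TType::T_I32, nullptr});
  auto inner = std::make_shared<TypeNode>(TypeNode{TType::T_LIST, i32});
  auto outer = std::make_shared<TypeNode>(TypeNode{TType::T_LIST, inner});

  TypeCursor cursor(outer);
  std::vector<TType> seen;
  seen.push_back(cursor.enterList());  // first "[": elements are lists
  seen.push_back(cursor.enterList());  // [1,2,3]: elements are i32
  cursor.leaveList();
  seen.push_back(cursor.enterList());  // [4,5,6]: elements are i32
  cursor.leaveList();
  cursor.leaveList();
  return seen;
}
```

Even in this list-only form, the protocol has to mirror every begin/end call that the generated code makes, and any mismatch between the two would silently desynchronize the cursor.
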