From: Igor Sapego
Date: Mon, 31 Oct 2016 20:38:34 +0300
Subject: Re: BinaryObject pros/cons
To: dev@ignite.apache.org
Cc: Valentin Kulichenko

Vladimir,

How about some reserved value? I.e., a -1 offset would mean that a
default/null value should be used?

Best Regards,
Igor

On Mon, Oct 31, 2016 at 5:05 PM, Vladimir Ozerov wrote:

> Valya,
>
> Do you have any ideas how to implement this? We write field offsets in the
> footer. If a field is not written, then what should be used for its offset?
>
> On Mon, Oct 31, 2016 at 4:56 PM, Valentin Kulichenko
> <valentin.kulichenko@gmail.com> wrote:
>
> > Vladimir,
> >
> > These are good points, but I'm not suggesting changing the schema. If
> > one writes five fields, the schema should have five fields in any case,
> > regardless of values. I only suggest changing the internal representation
> > of the object so that fields with default values are not saved in the
> > byte array, as we don't really need them there.
> >
> > -Val
> >
> > On Sun, Oct 30, 2016 at 12:24 PM, Vladimir Ozerov wrote:
> >
> >> Valya,
> >>
> >> I have several concerns:
> >> 1) Correctness: hasField() will not work properly. But we can probably
> >> fix that by adding this info to the schema.
> >> 2) Performance: we have lots of optimizations which depend on either a
> >> "stable" object schema or a low number of schemas. We will effectively
> >> turn them off.
> >> But what concerns me even more is that we may end up with an enormous
> >> number of schemas. E.g., consider an object with 10 number fields. If
> >> all fields could be zero, we may end up with something like 2^10 schemas.
> >>
> >> Vladimir.
> >>
> >> On Oct 29, 2016, at 0:37, "Valentin Kulichenko"
> >> <valentin.kulichenko@gmail.com> wrote:
> >>
> >>> Vova,
> >>>
> >>> Why do we need to write zeros and nulls in the first place? What's the
> >>> value of having them in the byte array?
> >>>
> >>> -Val
> >>>
> >>> On Fri, Oct 28, 2016 at 1:18 AM, Vladimir Ozerov wrote:
> >>>
> >>>> Valya,
> >>>>
> >>>> Currently a null value is written as one byte, while a zero value of
> >>>> long type is written as 9 bytes. I want to improve that and write
> >>>> zeros as one byte as well.
> >>>>
> >>>> As for var-length encoding, I am strongly against it. It saves IO and
> >>>> memory at the cost of CPU. If we encode numbers this way we will slow
> >>>> down SQL (which is already not very fast, to be honest), because
> >>>> instead of a single memory read we will have to perform multiple
> >>>> reads and then apply some mechanics to restore the original value. We
> >>>> already have such a problem with Strings: Java stores them as UTF-16,
> >>>> but we encode them as UTF-8. As a result, every read of a string
> >>>> field in SQL results in decoding overhead.
> >>>>
> >>>> Vladimir.
> >>>>
> >>>> On Fri, Oct 28, 2016 at 6:07 AM, Valentin Kulichenko
> >>>> <valentin.kulichenko@gmail.com> wrote:
> >>>>
> >>>>> Cross-posting this to dev list.
> >>>>>
> >>>>> Vladimir,
> >>>>>
> >>>>> To be honest, I don't see much difference between null values for
> >>>>> objects and zero values for primitives. From the BinaryObject
> >>>>> semantics standpoint, both are default values for the corresponding
> >>>>> types. These values will be returned from the BinaryObject.field()
> >>>>> method regardless of whether we actually save them in the byte array
> >>>>> or not. Having said that, why don't we just skip them during write?
> >>>>>
> >>>>> Your optimization will still be useful though, because there are
> >>>>> often a lot of ints and longs that are not zeros, but are still
> >>>>> small and can fit in 1-2 bytes. We already added such compaction in
> >>>>> direct message marshaling and it reduced overall traffic by around
> >>>>> 30%.
> >>>>>
> >>>>> -Val
> >>>>>
> >>>>> On Thu, Oct 27, 2016 at 2:21 PM, Vladimir Ozerov
> >>>>> <vozerov@gridgain.com> wrote:
> >>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> I am not very concerned with null fields overhead, because usually
> >>>>>> it won't be significant. However, there is a problem with zeros. A
> >>>>>> user object might have lots of int/long zeros; this is not
> >>>>>> uncommon. And each zero will consume 4-8 additional bytes. We will
> >>>>>> probably implement a special optimization which will write such
> >>>>>> fields in a special compact format.
> >>>>>>
> >>>>>> Vladimir.
> >>>>>>
> >>>>>> On Thu, Oct 27, 2016 at 10:55 PM, vkulichenko
> >>>>>> <valentin.kulichenko@gmail.com> wrote:
> >>>>>>
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> Yes, null values consume memory. I believe this can be optimized,
> >>>>>>> but I haven't seen issues with this so far. Unless you have
> >>>>>>> hundreds of fields most of which are nulls (very rare case), the
> >>>>>>> overhead is minimal.
> >>>>>>>
> >>>>>>> -Val
> >>>>>>>
> >>>>>>> --
> >>>>>>> View this message in context:
> >>>>>>> http://apache-ignite-users.70518.x6.nabble.com/BinaryObject-pros-cons-tp8541p8563.html
> >>>>>>> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
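
The reserved-offset idea from the top of this thread could look roughly like
the sketch below. It is an illustration only and does not reproduce Ignite's
actual binary format: the three-field SCHEMA, the ABSENT constant, and the
write_object/read_field helpers are all hypothetical names. It assumes a body
of field payloads followed by a footer with one signed 32-bit offset per
schema field, where -1 marks a field that was skipped because it holds its
type's default value.

```python
import struct

# Hypothetical sentinel: a footer offset of -1 marks a field whose payload
# was not written because it holds the default value for its type.
ABSENT = -1

# Toy schema of (field name, struct format, default value).
# This is an assumption for illustration, not Ignite's real layout.
SCHEMA = [("id", "<q", 0), ("count", "<i", 0), ("score", "<d", 0.0)]


def write_object(values):
    """Serialize as body bytes followed by one int32 offset per schema field."""
    body = bytearray()
    offsets = []
    for name, fmt, default in SCHEMA:
        value = values.get(name, default)
        if value == default:
            offsets.append(ABSENT)      # skip the payload entirely
        else:
            offsets.append(len(body))   # position of this field in the body
            body += struct.pack(fmt, value)
    footer = struct.pack("<%di" % len(SCHEMA), *offsets)
    return bytes(body) + footer


def read_field(data, name):
    """Look the field up in the footer; return the default on the sentinel."""
    footer_size = 4 * len(SCHEMA)
    offsets = struct.unpack("<%di" % len(SCHEMA), data[-footer_size:])
    for (field, fmt, default), offset in zip(SCHEMA, offsets):
        if field == name:
            if offset == ABSENT:
                return default
            return struct.unpack_from(fmt, data, offset)[0]
    raise KeyError(name)


blob = write_object({"id": 42, "count": 0, "score": 0.0})
```

In this sketch the footer always has one entry per schema field, so the
schema stays the same whether or not a value is a default; only the body
shrinks. The footer itself still costs 4 bytes per field, so the saving is
limited to the skipped payloads.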