ignite-dev mailing list archives

From Igor Sapego <isap...@gridgain.com>
Subject Re: BinaryObject pros/cons
Date Mon, 31 Oct 2016 17:38:34 GMT
Vladimir,

How about a reserved value? I.e., an offset of -1 would mean that the
default/null value should be used.
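
A minimal sketch of that idea, assuming a simplified layout (8-byte longs in the body, one int offset per schema field in the footer). This is not Ignite's actual binary format; the names SparseFooterSketch, Encoded, ABSENT, write and readLong are all hypothetical:

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch of a footer with a reserved "absent" offset; not Ignite's real format. */
final class SparseFooterSketch {
    /** Reserved footer value: the field was not written and reads as its default. */
    static final int ABSENT = -1;

    /** Body bytes plus one footer offset per schema field. */
    static final class Encoded {
        final byte[] body;
        final int[] offsets;

        Encoded(byte[] body, int[] offsets) {
            this.body = body;
            this.offsets = offsets;
        }
    }

    /** Writes only non-zero longs; zero fields get the ABSENT offset in the footer. */
    static Encoded write(long[] fields) {
        int nonZero = 0;
        for (long f : fields) {
            if (f != 0L) {
                nonZero++;
            }
        }

        ByteBuffer buf = ByteBuffer.allocate(nonZero * Long.BYTES);
        int[] offsets = new int[fields.length];

        for (int i = 0; i < fields.length; i++) {
            if (fields[i] == 0L) {
                offsets[i] = ABSENT;
            } else {
                offsets[i] = buf.position();
                buf.putLong(fields[i]);
            }
        }
        return new Encoded(buf.array(), offsets);
    }

    /** Returns the stored value, or the default 0 when the footer holds ABSENT. */
    static long readLong(Encoded obj, int fieldIdx) {
        int off = obj.offsets[fieldIdx];
        return off == ABSENT ? 0L : ByteBuffer.wrap(obj.body).getLong(off);
    }

    public static void main(String[] args) {
        Encoded obj = write(new long[] {0L, 42L, 0L, 7L});
        System.out.println("body bytes: " + obj.body.length);
        System.out.println("field 0: " + readLong(obj, 0));
        System.out.println("field 1: " + readLong(obj, 1));
    }
}
```

In this sketch the footer always has one entry per schema field, so the set of fields does not depend on the values that were written.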

Best Regards,
Igor

On Mon, Oct 31, 2016 at 5:05 PM, Vladimir Ozerov <vozerov@gridgain.com>
wrote:

> Valya,
>
> Do you have any ideas how to implement this? We write field offsets in the
> footer. If a field is not written, then what should be used for its offset?
>
> On Mon, Oct 31, 2016 at 4:56 PM, Valentin Kulichenko <
> valentin.kulichenko@gmail.com> wrote:
>
> > Vladimir,
> >
> > These are good points, but I'm not suggesting that we change the schema. If
> > one writes five fields, the schema should have five fields in any case,
> > regardless of values. I only suggest changing the internal representation
> > of the object so that fields with default values are not saved in the byte
> > array, as we don't really need them there.
> >
> > -Val
> >
> > On Sun, Oct 30, 2016 at 12:24 PM, Vladimir Ozerov <vozerov@gridgain.com>
> > wrote:
> >
> >> Valya,
> >>
> >> I have several concerns:
> >> 1) Correctness: hasField() will not work properly. But we can probably
> >> fix that by adding this info to the schema.
> >> 2) Performance: we have lots of optimizations which depend on either a
> >> "stable" object schema or a low number of schemas. We will effectively
> >> turn them off.
> >> But what concerns me even more is that we may end up with an enormous
> >> number of schemas. E.g., consider an object with 10 numeric fields. If all
> >> fields could be zero, we may end up with something like 2^10 schemas.
> >>
> >> Vladimir.
> >>
> >> On Oct 29, 2016 at 0:37, "Valentin Kulichenko" <
> >> valentin.kulichenko@gmail.com> wrote:
> >>
> >> Vova,
> >>>
> >>> Why do we need to write zeros and nulls in the first place? What's the
> >>> value of having them in the byte array?
> >>>
> >>> -Val
> >>>
> >>> On Fri, Oct 28, 2016 at 1:18 AM, Vladimir Ozerov <vozerov@gridgain.com>
> >>> wrote:
> >>>
> >>>> Valya,
> >>>>
> >>>> Currently a null value is written as one byte, while a zero value of long
> >>>> type is written as 9 bytes. I want to improve that and write zeros as
> >>>> one byte as well.
> >>>>
> >>>> As for var-length encoding, I am strongly against it. It saves IO and
> >>>> memory at the cost of CPU. If we encode numbers this way, we will
> >>>> slow down SQL (which is already not very fast, to be honest), because
> >>>> instead of a single memory read we will have to perform multiple
> >>>> reads and then apply some mechanics to restore the original value. We
> >>>> already have this problem with Strings: Java stores them as UTF-16, but
> >>>> we encode them as UTF-8. As a result, every read of a string field in
> >>>> SQL results in decoding overhead.
> >>>>
> >>>> Vladimir.
> >>>>
> >>>> On Fri, Oct 28, 2016 at 6:07 AM, Valentin Kulichenko <
> >>>> valentin.kulichenko@gmail.com> wrote:
> >>>>
> >>>>> Cross-posting this to dev list.
> >>>>>
> >>>>> Vladimir,
> >>>>>
> >>>>> To be honest, I don't see much difference between null values for
> >>>>> objects and zero values for primitives. From the BinaryObject semantics
> >>>>> standpoint, both are default values for the corresponding types. These
> >>>>> values will be returned from the BinaryObject.field() method regardless
> >>>>> of whether we actually save them in the byte array or not. Having said
> >>>>> that, why don't we just skip them during write?
> >>>>>
> >>>>> Your optimization will still be useful though, because there are often
> >>>>> a lot of ints and longs that are not zeros, but are still small and can
> >>>>> fit in 1-2 bytes. We already added such compaction in direct message
> >>>>> marshaling and it reduced overall traffic by around 30%.
> >>>>>
> >>>>> -Val
> >>>>>
> >>>>>
> >>>>> On Thu, Oct 27, 2016 at 2:21 PM, Vladimir Ozerov <
> >>>>> vozerov@gridgain.com> wrote:
> >>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> I am not very concerned with the overhead of null fields, because
> >>>>>> usually it won't be significant. However, there is a problem with
> >>>>>> zeros. A user object might have lots of int/long zeros; this is not
> >>>>>> uncommon. And each zero will consume 4-8 additional bytes. We will
> >>>>>> probably implement a special optimization which will write such fields
> >>>>>> in a special compact format.
> >>>>>>
> >>>>>> Vladimir.
> >>>>>>
> >>>>>> On Thu, Oct 27, 2016 at 10:55 PM, vkulichenko <
> >>>>>> valentin.kulichenko@gmail.com> wrote:
> >>>>>>
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> Yes, null values consume memory. I believe this can be optimized,
> >>>>>>> but I haven't seen issues with this so far. Unless you have hundreds
> >>>>>>> of fields, most of which are nulls (a very rare case), the overhead is
> >>>>>>> minimal.
> >>>>>>>
> >>>>>>> -Val
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> View this message in context: http://apache-ignite-users.70518.x6.nabble.com/BinaryObject-pros-cons-tp8541p8563.html
> >>>>>>> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >
>
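
For reference, the CPU-versus-size trade-off of var-length encoding raised earlier in the thread can be illustrated with a small sketch. It assumes a generic 7-bits-per-byte scheme (LEB128-style); it is not the encoding used by Ignite's direct message marshaling, and the names VarLongSketch, writeVarLong, readVarLong and readFixedLong are hypothetical:

```java
import java.nio.ByteBuffer;

/** Hypothetical comparison of fixed-width and variable-length long reads. */
final class VarLongSketch {
    /** Writes val in 7-bit groups, low group first; returns the position after the last byte. */
    static int writeVarLong(byte[] dst, int pos, long val) {
        while ((val & ~0x7FL) != 0) {
            // More groups follow: set the continuation bit.
            dst[pos++] = (byte) ((val & 0x7F) | 0x80);
            val >>>= 7;
        }
        dst[pos++] = (byte) val;
        return pos;
    }

    /** Restores the value byte by byte: one read, mask and shift per stored byte. */
    static long readVarLong(byte[] src, int pos) {
        long res = 0;
        int shift = 0;
        byte b;
        do {
            b = src[pos++];
            res |= (long) (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return res;
    }

    /** Fixed-width read: a single 8-byte load at a known offset. */
    static long readFixedLong(byte[] src, int pos) {
        return ByteBuffer.wrap(src).getLong(pos);
    }

    public static void main(String[] args) {
        byte[] buf = new byte[10];
        int end = writeVarLong(buf, 0, 300L);
        System.out.println("bytes used: " + end + ", value: " + readVarLong(buf, 0));
    }
}
```

Small values take one or two bytes instead of eight, but every read goes through the loop in readVarLong rather than a single load.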
