ignite-dev mailing list archives

From Vladimir Ozerov <voze...@gridgain.com>
Subject Re: BinaryObject pros/cons
Date Mon, 31 Oct 2016 20:00:33 GMT
Igor,

Good catch. Probably some MAX value could help us here.
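
A minimal sketch of the reserved-value idea, assuming a simplified layout that is not Ignite's actual binary format: every field is a long, the body holds only non-zero values, and the footer holds one 4-byte offset per field. The class name, the ABSENT constant, and both methods are hypothetical and exist only for illustration. The largest offset value is reserved to mean "field not written, use the default".

```java
import java.nio.ByteBuffer;

public class ReservedOffsetSketch {
    // Hypothetical sentinel: the largest offset is reserved to mean "not written".
    static final int ABSENT = Integer.MAX_VALUE;

    /** Writes non-zero fields to the body, then a footer with one int offset per field. */
    static byte[] write(long[] fields) {
        int bodyLen = 0;
        for (long f : fields)
            if (f != 0)
                bodyLen += Long.BYTES;

        ByteBuffer buf = ByteBuffer.allocate(bodyLen + fields.length * Integer.BYTES);
        int[] offsets = new int[fields.length];

        for (int i = 0; i < fields.length; i++) {
            if (fields[i] == 0)
                offsets[i] = ABSENT;          // default value: nothing goes to the body
            else {
                offsets[i] = buf.position();  // offset relative to object start
                buf.putLong(fields[i]);
            }
        }

        // The footer always has one entry per field, so the field count is unchanged.
        for (int off : offsets)
            buf.putInt(off);

        return buf.array();
    }

    /** Reads field idx; a reserved offset yields the default value 0. */
    static long read(byte[] data, int fieldCnt, int idx) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        int footerStart = data.length - fieldCnt * Integer.BYTES;
        int off = buf.getInt(footerStart + idx * Integer.BYTES);

        return off == ABSENT ? 0L : buf.getLong(off);
    }
}
```

In this sketch the footer keeps the same number of entries whether or not a value is written, so the set of fields does not depend on the values. The cost is that one offset value can no longer be used as a real position.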

On Mon, Oct 31, 2016 at 9:17 PM, Igor Sapego <isapego@gridgain.com> wrote:

> Valentin,
>
> -1 was just an example. I've checked: currently we use the whole possible
> range of offset values. So if we are going to use the suggested approach,
> we need to reserve some value and adjust the
> serialization/deserialization algorithms.
>
> Best Regards,
> Igor
>
> On Mon, Oct 31, 2016 at 8:46 PM, Valentin Kulichenko <
> valentin.kulichenko@gmail.com> wrote:
>
> > Makes sense to me, but not sure about -1 in particular. Is this offset
> > relative to object start position? What values can it have?
> >
> > -Val
> >
> > On Mon, Oct 31, 2016 at 10:38 AM, Igor Sapego <isapego@gridgain.com>
> > wrote:
> >
> >> Vladimir,
> >>
> >> How about some reserved value? E.g. a -1 offset means a default/null
> >> value should be used?
> >>
> >> Best Regards,
> >> Igor
> >>
> >> On Mon, Oct 31, 2016 at 5:05 PM, Vladimir Ozerov <vozerov@gridgain.com>
> >> wrote:
> >>
> >>> Valya,
> >>>
> >>> Do you have any ideas how to implement this? We write field offsets in
> >>> the footer. If a field is not written, then what should be used for its
> >>> offset?
> >>>
> >>> On Mon, Oct 31, 2016 at 4:56 PM, Valentin Kulichenko <
> >>> valentin.kulichenko@gmail.com> wrote:
> >>>
> >>> > Vladimir,
> >>> >
> >>> > These are good points, but I'm not suggesting changing the schema. If
> >>> > one writes five fields, the schema should have five fields in any case,
> >>> > regardless of values. I only suggest changing the internal
> >>> > representation of the object and not saving fields with default values
> >>> > in the byte array, as we don't really need them there.
> >>> >
> >>> > -Val
> >>> >
> >>> > On Sun, Oct 30, 2016 at 12:24 PM, Vladimir Ozerov
> >>> > <vozerov@gridgain.com> wrote:
> >>> >
> >>> >> Valya,
> >>> >>
> >>> >> I have several concerns:
> >>> >> 1) Correctness: hasField() will not work properly. But probably we can
> >>> >> fix that by adding this info to the schema.
> >>> >> 2) Performance: we have lots of optimizations which depend on either a
> >>> >> "stable" object schema or a low number of schemas. We will effectively
> >>> >> turn them off.
> >>> >> But what concerns me even more is that we may end up with an enormous
> >>> >> number of schemas. E.g. consider an object with 10 numeric fields. If
> >>> >> all fields could be zero, we may end up with something like 2^10
> >>> >> schemas.
> >>> >>
> >>> >> Vladimir.
> >>> >>
> >>> >> On Oct 29, 2016 at 0:37, "Valentin Kulichenko"
> >>> >> <valentin.kulichenko@gmail.com> wrote:
> >>> >>
> >>> >>> Vova,
> >>> >>>
> >>> >>> Why do we need to write zeros and nulls in the first place? What's
> >>> >>> the value of having them in the byte array?
> >>> >>>
> >>> >>> -Val
> >>> >>>
> >>> >>> On Fri, Oct 28, 2016 at 1:18 AM, Vladimir Ozerov
> >>> >>> <vozerov@gridgain.com> wrote:
> >>> >>>
> >>> >>>> Valya,
> >>> >>>>
> >>> >>>> Currently a null value is written as one byte, while a zero value of
> >>> >>>> long type is written as 9 bytes. I want to improve that and write
> >>> >>>> zeros as one byte as well.
> >>> >>>>
> >>> >>>> As for var-length encoding, I am strongly against it. It saves IO
> >>> >>>> and memory at the cost of CPU. If we encode numbers this way, we
> >>> >>>> will slow down SQL (which is already not very fast, to be honest),
> >>> >>>> because instead of a single memory read we will have to perform
> >>> >>>> multiple reads and then apply some mechanics to restore the original
> >>> >>>> value. We already have such a problem with Strings: Java stores them
> >>> >>>> as UTF-16, but we encode them as UTF-8. As a result, every read of a
> >>> >>>> string field in SQL results in decoding overhead.
> >>> >>>>
> >>> >>>> Vladimir.
> >>> >>>>
> >>> >>>> On Fri, Oct 28, 2016 at 6:07 AM, Valentin Kulichenko <
> >>> >>>> valentin.kulichenko@gmail.com> wrote:
> >>> >>>>
> >>> >>>>> Cross-posting this to dev list.
> >>> >>>>>
> >>> >>>>> Vladimir,
> >>> >>>>>
> >>> >>>>> To be honest, I don't see much difference between null values for
> >>> >>>>> objects and zero values for primitives. From the BinaryObject
> >>> >>>>> semantics standpoint, both are default values for the corresponding
> >>> >>>>> types. These values will be returned from the BinaryObject.field()
> >>> >>>>> method regardless of whether we actually save them in the byte array
> >>> >>>>> or not. Having said that, why don't we just skip them during write?
> >>> >>>>>
> >>> >>>>> Your optimization will still be useful though, because there are
> >>> >>>>> often a lot of ints and longs that are not zeros, but are still
> >>> >>>>> small and can fit in 1-2 bytes. We already added such compaction in
> >>> >>>>> direct message marshaling and it reduced overall traffic by around
> >>> >>>>> 30%.
> >>> >>>>>
> >>> >>>>> -Val
> >>> >>>>>
> >>> >>>>>
> >>> >>>>> On Thu, Oct 27, 2016 at 2:21 PM, Vladimir Ozerov
> >>> >>>>> <vozerov@gridgain.com> wrote:
> >>> >>>>>
> >>> >>>>>> Hi,
> >>> >>>>>>
> >>> >>>>>> I am not very concerned with the overhead of null fields, because
> >>> >>>>>> usually it won't be significant. However, there is a problem with
> >>> >>>>>> zeros. A user object might have lots of int/long zeros; this is not
> >>> >>>>>> uncommon. And each zero will consume 4-8 additional bytes. We will
> >>> >>>>>> probably implement a special optimization which will write such
> >>> >>>>>> fields in a special compact format.
> >>> >>>>>>
> >>> >>>>>> Vladimir.
> >>> >>>>>>
> >>> >>>>>> On Thu, Oct 27, 2016 at 10:55 PM, vkulichenko <
> >>> >>>>>> valentin.kulichenko@gmail.com> wrote:
> >>> >>>>>>
> >>> >>>>>>> Hi,
> >>> >>>>>>>
> >>> >>>>>>> Yes, null values consume memory. I believe this can be optimized,
> >>> >>>>>>> but I haven't seen issues with this so far. Unless you have
> >>> >>>>>>> hundreds of fields, most of which are nulls (a very rare case), the
> >>> >>>>>>> overhead is minimal.
> >>> >>>>>>>
> >>> >>>>>>> -Val
> >>> >>>>>>>
> >>> >>>>>>>
> >>> >>>>>>>
> >>> >>>>>>> --
> >>> >>>>>>> View this message in context:
> >>> >>>>>>> http://apache-ignite-users.70518.x6.nabble.com/BinaryObject-pros-cons-tp8541p8563.html
> >>> >>>>>>> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
> >>> >>>>>>>
> >>> >>>>>>
> >>> >>>>>>
> >>> >>>>>
> >>> >>>>
> >>> >>>
> >>> >
> >>>
> >>
> >>
> >
>
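
To make the var-length trade-off discussed in the quoted thread concrete, here is a minimal sketch, assuming a generic 7-bits-per-byte scheme. It is not necessarily what Ignite's direct message marshaling does, and the class and method names are hypothetical. Small values take 1-2 bytes, but decoding needs a loop with one read, mask, and shift per byte instead of a single fixed-width read.

```java
public class VarLongSketch {
    /** Writes v as 7-bit groups, low bits first; returns the number of bytes written. */
    static int writeVarLong(long v, byte[] dst, int pos) {
        int start = pos;

        // While bits remain above the low 7, emit a group with the continuation bit set.
        while ((v & ~0x7FL) != 0) {
            dst[pos++] = (byte)((v & 0x7F) | 0x80);
            v >>>= 7;
        }

        dst[pos++] = (byte)v; // last group, continuation bit clear

        return pos - start;
    }

    /** Reads a value written by writeVarLong: one read, mask and shift per byte. */
    static long readVarLong(byte[] src, int pos) {
        long res = 0;
        int shift = 0;
        byte b;

        do {
            b = src[pos++];
            res |= (long)(b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);

        return res;
    }
}
```

With this scheme a zero occupies one byte, while a large long can take more bytes than a fixed 8-byte write, which is the size-versus-CPU trade-off the thread argues about.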
