ignite-dev mailing list archives

From Valentin Kulichenko <valentin.kuliche...@gmail.com>
Subject Re: BinaryObject pros/cons
Date Mon, 31 Oct 2016 17:46:23 GMT
Makes sense to me, but I'm not sure about -1 in particular. Is this offset
relative to the object's start position? What values can it have?

-Val

On Mon, Oct 31, 2016 at 10:38 AM, Igor Sapego <isapego@gridgain.com> wrote:

> Vladimir,
>
> How about some reserved value? I.e., a -1 offset would mean that a
> default/null value should be used?
>
> Best Regards,
> Igor
>
> On Mon, Oct 31, 2016 at 5:05 PM, Vladimir Ozerov <vozerov@gridgain.com>
> wrote:
>
>> Valya,
>>
>> Do you have any ideas how to implement this? We write field offsets in the
>> footer. If a field is not written, then what should be used for its offset?
>>
>> On Mon, Oct 31, 2016 at 4:56 PM, Valentin Kulichenko <
>> valentin.kulichenko@gmail.com> wrote:
>>
>> > Vladimir,
>> >
>> > These are good points, but I'm not suggesting that we change the schema. If
>> > one writes five fields, the schema should have five fields in any case,
>> > regardless of values. I only suggest changing the internal representation
>> > of the object so that fields with default values are not saved in the
>> > byte array, as we don't really need them there.
>> >
>> > -Val
>> >
>> > On Sun, Oct 30, 2016 at 12:24 PM, Vladimir Ozerov <vozerov@gridgain.com>
>> > wrote:
>> >
>> >> Valya,
>> >>
>> >> I have several concerns:
>> >> 1) Correctness: hasField() will not work properly. But we can probably
>> >> fix that by adding this info to the schema.
>> >> 2) Performance: we have lots of optimizations which depend on either a
>> >> "stable" object schema or a low number of schemas. We will effectively
>> >> turn them off.
>> >> But what concerns me even more is that we may end up with an enormous
>> >> number of schemas. E.g. consider an object with 10 numeric fields. If all
>> >> fields could be zero, we may end up with something like 2^10 schemas.
>> >>
>> >> Vladimir.
>> >>
>> >> On Oct 29, 2016 at 0:37, "Valentin Kulichenko" <
>> >> valentin.kulichenko@gmail.com> wrote:
>> >>
>> >>> Vova,
>> >>>
>> >>> Why do we need to write zeros and nulls in the first place? What's the
>> >>> value of having them in the byte array?
>> >>>
>> >>> -Val
>> >>>
>> >>> On Fri, Oct 28, 2016 at 1:18 AM, Vladimir Ozerov <vozerov@gridgain.com>
>> >>> wrote:
>> >>>
>> >>>> Valya,
>> >>>>
>> >>>> Currently a null value is written as one byte, while a zero value of
>> >>>> long type is written as 9 bytes. I want to improve that and write zeros
>> >>>> as one byte as well.
>> >>>>
>> >>>> As for var-length encoding, I am strongly against it. It saves IO and
>> >>>> memory at the cost of CPU. If we encode numbers this way, we will slow
>> >>>> down SQL (which is already not very fast, to be honest), because
>> >>>> instead of a single memory read we will have to perform multiple reads
>> >>>> and then apply some mechanics to restore the original value. We already
>> >>>> have this problem with Strings: Java stores them as UTF-16, but we
>> >>>> encode them as UTF-8. As a result, every read of a string field in SQL
>> >>>> incurs decoding overhead.
>> >>>>
>> >>>> Vladimir.
>> >>>>
>> >>>> On Fri, Oct 28, 2016 at 6:07 AM, Valentin Kulichenko <
>> >>>> valentin.kulichenko@gmail.com> wrote:
>> >>>>
>> >>>>> Cross-posting this to dev list.
>> >>>>>
>> >>>>> Vladimir,
>> >>>>>
>> >>>>> To be honest, I don't see much difference between null values for
>> >>>>> objects and zero values for primitives. From the BinaryObject semantics
>> >>>>> standpoint, both are default values for the corresponding types. These
>> >>>>> values will be returned from the BinaryObject.field() method regardless
>> >>>>> of whether we actually save them in the byte array or not. Having said
>> >>>>> that, why don't we just skip them during write?
>> >>>>>
>> >>>>> Your optimization will still be useful though, because there are often
>> >>>>> a lot of ints and longs that are not zeros, but are still small and can
>> >>>>> fit in 1-2 bytes. We already added such compaction in direct message
>> >>>>> marshaling and it reduced overall traffic by around 30%.
>> >>>>>
>> >>>>> -Val
>> >>>>>
>> >>>>>
>> >>>>> On Thu, Oct 27, 2016 at 2:21 PM, Vladimir Ozerov <vozerov@gridgain.com>
>> >>>>> wrote:
>> >>>>>
>> >>>>>> Hi,
>> >>>>>>
>> >>>>>> I am not very concerned with null fields overhead, because usually it
>> >>>>>> won't be significant. However, there is a problem with zeros. A user
>> >>>>>> object might have lots of int/long zeros; this is not uncommon. And
>> >>>>>> each zero will consume 4-8 additional bytes. We will probably implement
>> >>>>>> a special optimization which will write such fields in a special
>> >>>>>> compact format.
>> >>>>>>
>> >>>>>> Vladimir.
>> >>>>>>
>> >>>>>> On Thu, Oct 27, 2016 at 10:55 PM, vkulichenko <
>> >>>>>> valentin.kulichenko@gmail.com> wrote:
>> >>>>>>
>> >>>>>>> Hi,
>> >>>>>>>
>> >>>>>>> Yes, null values consume memory. I believe this can be optimized, but
>> >>>>>>> I haven't seen issues with this so far. Unless you have hundreds of
>> >>>>>>> fields, most of which are nulls (a very rare case), the overhead is
>> >>>>>>> minimal.
>> >>>>>>>
>> >>>>>>> -Val
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> --
>> >>>>>>> View this message in context:
>> >>>>>>> http://apache-ignite-users.70518.x6.nabble.com/BinaryObject-pros-cons-tp8541p8563.html
>> >>>>>>> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>> >>>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>
>> >>>>
>> >>>
>> >
>>
>
>
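
Illustration (not part of the original messages): the reserved-offset idea
discussed in this thread could look roughly like the sketch below. The layout
is hypothetical and is not Ignite's actual binary format: the body holds only
the non-zero int fields, and a footer holds one 4-byte offset per field,
relative to the start of the object. The class name, the ABSENT constant, and
the fixed 4-byte footer entries are all assumptions made for this example.
Because real offsets in this layout are never negative, -1 cannot collide
with one.

```java
import java.nio.ByteBuffer;

/** Hypothetical layout: [non-zero int values][one int offset per field]. */
final class SentinelOffsetSketch {
    /** Reserved footer value meaning "field not written, use the default". */
    static final int ABSENT = -1;

    /** Serializes int fields, skipping zeros in the body but keeping one footer slot per field. */
    static byte[] write(int[] fields) {
        int nonDefault = 0;
        for (int v : fields)
            if (v != 0)
                nonDefault++;

        ByteBuffer buf = ByteBuffer.allocate(nonDefault * 4 + fields.length * 4);
        int[] offsets = new int[fields.length];

        for (int i = 0; i < fields.length; i++) {
            if (fields[i] == 0)
                offsets[i] = ABSENT;         // nothing written to the body
            else {
                offsets[i] = buf.position(); // offset relative to object start
                buf.putInt(fields[i]);
            }
        }

        // Footer: the field count (and so the schema) is the same whatever the values are.
        for (int off : offsets)
            buf.putInt(off);

        return buf.array();
    }

    /** Reads field idx, returning the default (0) when the footer holds the sentinel. */
    static int read(byte[] data, int fieldCount, int idx) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        int footerStart = data.length - fieldCount * 4;
        int off = buf.getInt(footerStart + idx * 4);
        return off == ABSENT ? 0 : buf.getInt(off);
    }
}
```

With five fields of which two are non-zero, this layout takes 2 * 4 body bytes
plus 5 * 4 footer bytes, and the footer always has five entries, which is the
property Val asks for when he says the schema should not change.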

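Illustration (not part of the original messages): the CPU cost of
variable-length encoding that Vladimir mentions can be seen in a minimal
LEB128-style sketch. This is a generic scheme chosen for the example; the
thread does not say which encoding Ignite's direct message marshaling uses,
and the class and method names here are hypothetical. A fixed-width long is
one 8-byte read, while the decoder below needs a loop with a read, a mask and
a shift for each byte.

```java
import java.nio.ByteBuffer;

final class VarLongSketch {
    /** Writes 7 bits per byte, low bits first; the high bit of a byte means "more bytes follow". */
    static void writeVarLong(ByteBuffer buf, long v) {
        while ((v & ~0x7FL) != 0) {
            buf.put((byte) ((v & 0x7F) | 0x80));
            v >>>= 7;
        }
        buf.put((byte) v);
    }

    /** Restores the value: one read, one mask and one shift per encoded byte. */
    static long readVarLong(ByteBuffer buf) {
        long result = 0;
        int shift = 0;
        byte b;
        do {
            b = buf.get();
            result |= (long) (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return result;
    }

    /** Number of bytes the encoding takes (at most 10 for a 64-bit value). */
    static int encodedSize(long v) {
        ByteBuffer buf = ByteBuffer.allocate(10);
        writeVarLong(buf, v);
        return buf.position();
    }
}
```

In this scheme zero takes one byte and small values take one or two, which is
the space saving Val describes. Values with the top bit set, such as negative
longs, take 10 bytes, more than the fixed 8.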