ignite-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Vladimir Ozerov <voze...@gridgain.com>
Subject Re: [IMPORTANT] Future of Binary Objects
Date Tue, 20 Nov 2018 07:56:32 GMT
Hi Alexey,

Binary Objects only.

On Tue, Nov 20, 2018 at 10:50 AM Alexey Zinoviev <zaleslaw.sin@gmail.com>
wrote:

> Do we discuss here Core features only or the roadmap for all components?
>
> вт, 20 нояб. 2018 г. в 10:05, Vladimir Ozerov <vozerov@gridgain.com>:
>
> > Igniters,
> >
> > It is very likely that Apache Ignite 3.0 will be released next year. So
> we
> > need to start thinking about major product improvements. I'd like to
> start
> > with binary objects.
> >
> > Currently they are one of the main limiting factors for the product. They
> > are fat - 30+ bytes overhead on average, high TCO of Apache Ignite
> > comparing to other vendors. They are slow - not suitable for SQL at all.
> >
> > I would like to ask all of you who worked with binary objects to share
> your
> > feedback and ideas, so that we understand how they should look like in AI
> > 3.0. This is a brain storm - let's accumulate ideas first and minimize
> > critics. Then we will work on ideas in separate topics.
> >
> > 1) Historical background
> >
> > BO were implemented around 2014 (Apache Ignite 1.5) when we started
> working
> > on .NET and CPP clients. During design we had several ideas in mind:
> > - ability to read object fields in O(1) without deserialization
> > - interoperabillty between Java, .NET and CPP.
> >
> > Since then a number of other concepts were mixed to the cocktail:
> > - Affinity key fields
> > - Strict typing for existing fields (aka metadata)
> > - Binary Object as storage format
> >
> > 2) My proposals
> >
> > 2.1) Introduce "Data Row Format" interface
> > Binary Objects are terrible candidates for storage. Too fat, too slow.
> > Efficient storage typically has <10 bytes overhead per row (no metadata,
> no
> > length, no hash code, etc), allow supper-fast field access, support
> > different string formats (ASCII, UTF-8, etc), support different temporal
> > types (date, time, timestamp, timestamp with timezone, etc), and store
> > these types as efficiently as possible.
> >
> > What we need is to introduce an interface which will convert a pair of
> > key-value objects into a row. This row will be used to store data and to
> > get fields from it. Care about memory consumption, need SQL and strict
> > schema - use one format. Need flexibility and prefer key-value access -
> use
> > another format which will store binary objects unchanged (current
> > behavior).
> >
> > interface DataRowFormat {
> >     DataRow create(Object key, Object value); // primitives or binary
> > objects
> >     DataRowMetadata metadata();
> > }
> >
> > 2.2) Remove affinity field from metadata
> > Affinity rules are governed by cache, not type. We should remove
> > "affintiyFieldName" from metadata.
> >
> > 2.3) Remove restrictions on changing field type
> > I do not know why we did that in the first place. This restriction
> prevents
> > type evolution and confuses users.
> >
> > 2.4) Use bitmaps for "null" and default values and for fixed-length
> fields,
> > put fixed-length fields before variable-length.
> > Motivation: to save space.
> >
> > What else? Please share your ideas.
> >
> > Vladimir.
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message