ignite-dev mailing list archives

From Vladimir Ozerov <voze...@gridgain.com>
Subject Re: Non-UTF-8 string encoding support in BinaryMarshaller (IGNITE-5655)
Date Fri, 28 Jul 2017 11:45:33 GMT
String encoding is a concept similar to "collation" in an RDBMS. You can
define it either globally or on a per-table basis. The same should be done
for Ignite. We do not define the behavior of a type. We define the behavior
of a *storage*.
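
To illustrate the storage-level idea, here is a minimal sketch of encoding
resolution with a global default and per-cache overrides. The class
StorageEncodings, its methods, and the cache names are hypothetical and are
not part of Ignite's API; only java.nio.charset and java.util are real.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper, NOT an Ignite class: resolves string encoding per storage (cache/table). */
class StorageEncodings {
    /** Global default, analogous to a database-wide collation. */
    private final Charset globalDefault;

    /** Per-cache overrides, analogous to a per-table collation. */
    private final Map<String, Charset> perCache = new ConcurrentHashMap<>();

    StorageEncodings(Charset globalDefault) {
        this.globalDefault = globalDefault;
    }

    /** Registers an override; may be called at any time, e.g. when a cache is created. */
    void setCacheEncoding(String cacheName, Charset charset) {
        perCache.put(cacheName, charset);
    }

    /** Returns the override for the cache if one exists, otherwise the global default. */
    Charset resolve(String cacheName) {
        return perCache.getOrDefault(cacheName, globalDefault);
    }

    public static void main(String[] args) {
        StorageEncodings enc = new StorageEncodings(StandardCharsets.UTF_8);
        // Assumption: windows-1251 is available in this JVM.
        enc.setCacheEncoding("persons_ru", Charset.forName("windows-1251"));
        System.out.println(enc.resolve("persons_ru")); // per-cache override
        System.out.println(enc.resolve("persons_us")); // falls back to the global default
    }
}
```

The type (Person) never appears in this lookup; only the storage name does,
which is the point of the collation analogy.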

Two cases where the proposed per-type and per-field approach
doesn't work:
1) I have a class Person with a field "name". I have two caches/tables: one
for US persons, where names are in Latin script, and another for RU persons
with Cyrillic names. How can I achieve optimal encoding formats for both
tables?
2) I have an empty grid. Now I want to create a cache/table with a custom
encoding. How can I do that without a cluster restart? I can't, because
BinaryTypeConfiguration is configured statically, while caches/tables can be
created at runtime.
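
Case 1 can be made concrete with a small size comparison. This is only a
sketch using java.nio.charset; the class name EncodingSizes and the sample
names are made up for illustration, and windows-1251 is assumed to be
available in the running JVM (it is not among the charsets every JVM is
required to support).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Illustrative only: compares encoded sizes of the same field under different charsets. */
class EncodingSizes {
    /** Number of bytes the string occupies when encoded with the given charset. */
    static int encodedLength(String s, Charset cs) {
        return s.getBytes(cs).length;
    }

    public static void main(String[] args) {
        // Assumption: the JVM ships the windows-1251 charset.
        Charset cp1251 = Charset.forName("windows-1251");

        String usName = "Ivan";                      // Latin script
        String ruName = "\u0418\u0432\u0430\u043d";  // the same name in Cyrillic

        // Latin text: one byte per character in both encodings.
        System.out.println(encodedLength(usName, StandardCharsets.UTF_8)); // 4
        System.out.println(encodedLength(usName, cp1251));                 // 4

        // Cyrillic text: two bytes per character in UTF-8, one in windows-1251.
        System.out.println(encodedLength(ruName, StandardCharsets.UTF_8)); // 8
        System.out.println(encodedLength(ruName, cp1251));                 // 4
    }
}
```

The same Person.name field therefore has a different optimal encoding
depending on which cache it lives in, which a per-type setting cannot
express.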

On Fri, Jul 28, 2017 at 2:38 PM, Pavel Tupitsyn <ptupitsyn@apache.org>
wrote:

> > As Pavel mentioned, Marshaller should not be tied to cache
> > should be added to per-cache level
> Not sure if I follow.
> Marshalling and caching are two separate mechanisms.
> Defining binary format in CacheConfiguration violates separation of
> concerns.
>
> > Encoding *must not* be added to per-class or per-field level, this is
> > wrong
> What is wrong with this? BinaryTypeConfiguration looks like the right place
> for such a setting.
> Are we talking from an SQL standpoint here, so you want this to be defined
> somehow via DDL in the future?
>
> On Fri, Jul 28, 2017 at 2:30 PM, Vladimir Ozerov <vozerov@gridgain.com>
> wrote:
>
> > Encoding *must not* be added to per-class or per-field level, this is
> > wrong.
> >
> > It should be added to per-cache level, and to per-cache-column level in
> > future.
> >
> > On Fri, Jul 28, 2017 at 14:27, Andrey Kuznetsov <stkuzma@gmail.com>:
> >
> > > We discussed this with Pavel and Anton just a moment ago. Summary
> > > follows.
> > >
> > > - New byte "flag" is to be added (ENCODED_STRING)
> > > - 'Encoding' property is to be added at
> > >   -- global level (BinaryConfiguration)
> > >   -- per-class level (BinaryTypeConfiguration)
> > >   -- per-field level (BinaryTypeConfiguration)
> > >
> > > 2017-07-28 14:15 GMT+03:00 Vladimir Ozerov [via Apache Ignite Developers]
> > > <ml+s2346864n20159h78@n4.nabble.com>:
> > >
> > > > As Pavel mentioned, Marshaller should not be tied to cache; BinaryObject
> > > > should be self-explanatory, i.e. contain all information necessary for
> > > > unmarshalling. This is an absolute requirement.
> > > >
> > > > We will have one extra byte in serialized form, meaning that the
> > > > advantage of custom encoding will become evident for all strings with
> > > > length >= 1, which is perfectly fine. I do not quite understand what we
> > > > are arguing about.
> > > >
> > > > As for configuration, we can do it as follows:
> > > >
> > > > 1) Add global encoding, UTF8 by default.
> > > > 2) Add per-cache encoding.
> > > > 3) Add encoding to JDBC and ODBC driver properties.
> > > >
> > > > This should be enough.
> > > >
> > > >
> > > --
> > > Best regards,
> > >   Andrey Kuznetsov.
> > >
> > >
> > >
> > >
> > > --
> > > View this message in context:
> > > http://apache-ignite-developers.2346864.n4.nabble.com/Non-UTF-8-string-encoding-support-in-BinaryMarshaller-IGNITE-5655-tp20024p20161.html
> > > Sent from the Apache Ignite Developers mailing list archive at Nabble.com.
> >
>
