From: Vladimir Ozerov
Date: Fri, 28 Jul 2017 14:45:33 +0300
Subject: Re:
 Non-UTF-8 string encoding support in BinaryMarshaller (IGNITE-5655)
To: dev@ignite.apache.org

String encoding is a concept similar to "collation" in RDBMS. You can
define it either globally or on a per-table basis. The same should be done
for Ignite. We do not define the behavior of a type. We define the behavior
of a *storage*.

Two cases where the proposed per-type and per-type-field approach doesn't
work:
1) I have a class Person with field "name". I have two caches/tables: one
for US persons, where the name is in Latin, another for RU persons with
Cyrillic names. How can I achieve optimal encoding formats for both tables?
2) I have an empty grid. Now I want to create a cache/table with a custom
encoding. How can I do that without a cluster restart? I cannot, because
BinaryTypeConfiguration is configured statically, while caches/tables can be
created at runtime.

On Fri, Jul 28, 2017 at 2:38 PM, Pavel Tupitsyn wrote:

> > As Pavel mentioned, Marshaller should not be tied to cache
> > should be added to per-cache level
> Not sure if I follow.
> Marshalling and caching are two separate mechanisms.
> Defining binary format in CacheConfiguration violates separation of
> concerns.
>
> > Encoding *must not* be added to per-class or per-field level, this is
> > wrong
> What is wrong with this? BinaryTypeConfiguration looks the right place for
> such a setting.
> Are we talking from SQL standpoint here, so you want this to be defined
> somehow via DDL in future?
>
> On Fri, Jul 28, 2017 at 2:30 PM, Vladimir Ozerov wrote:
>
> > Encoding *must not* be added to per-class or per-field level, this is
> > wrong.
> >
> > It should be added to per-cache level, and to per-cache-column level in
> > future.
> >
> > On Fri, 28 Jul 2017 at 14:27, Andrey Kuznetsov wrote:
> >
> > > We discussed this with Pavel and Anton just a moment ago. Summary
> > > follows.
> > >
> > > - New byte "flag" is to be added (ENCODED_STRING)
> > > - 'Encoding' property is to be added at
> > > -- global level (BinaryConfiguration)
> > > -- per-class level (BinaryTypeConfiguration)
> > > -- per-field level (BinaryTypeConfiguration)
> > >
> > > 2017-07-28 14:15 GMT+03:00 Vladimir Ozerov [via Apache Ignite
> > > Developers] <ml+s2346864n20159h78@n4.nabble.com>:
> > >
> > > > As Pavel mentioned, Marshaller should not be tied to cache,
> > > > BinaryObject should be self-explanatory, i.e. containing all
> > > > information necessary for unmarshalling. This is an absolute
> > > > requirement.
> > > >
> > > > We will have one extra byte for in serialized form, meaning that
> > > > advantage of custom encoding will become evident for all strings
> > > > with length >= 1, which is perfectly fine. I do not quite
> > > > understand what are we arguing about.
> > > >
> > > > As far as configuration, we can do it as follows:
> > > >
> > > > 1) Add global encoding, UTF8 by default.
> > > > 2) Add per-cache encoding.
> > > > 3) Add encoding to JDBC and ODBC driver properties.
> > > >
> > > > This should be enough.
> > >
> > > --
> > > Best regards,
> > > Andrey Kuznetsov.
> > >
> > > --
> > > View this message in context:
> > > http://apache-ignite-developers.2346864.n4.nabble.com/Non-UTF-8-string-encoding-support-in-BinaryMarshaller-IGNITE-5655-tp20024p20161.html
> > > Sent from the Apache Ignite Developers mailing list archive at
> > > Nabble.com.
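
To make the size argument in this thread concrete, here is a minimal Java sketch. It assumes a hypothetical one-byte marker written in front of a custom-encoded string (standing in for the ENCODED_STRING flag mentioned above, whose actual layout is not specified in this thread). The class name, method names, and MARKER_BYTES constant are illustrative only and are not part of Ignite. It compares UTF-8 with windows-1251 for a Latin name and a Cyrillic name, which is the two-table case from point 1.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustrative only: not Ignite code and not the BinaryMarshaller wire format.
class EncodingSizeSketch {
    /** Assumption: a custom-encoded string costs one extra marker byte. */
    static final int MARKER_BYTES = 1;

    /** Number of bytes the string occupies in the given charset, no marker. */
    public static int payloadSize(String s, Charset cs) {
        return s.getBytes(cs).length;
    }

    /** Payload size plus the assumed one-byte marker for a custom encoding. */
    public static int customEncodedSize(String s, Charset cs) {
        return MARKER_BYTES + payloadSize(s, cs);
    }

    public static void main(String[] args) {
        Charset cp1251 = Charset.forName("windows-1251");

        // "John" in Latin and "Ivan" in Cyrillic (written with Unicode escapes
        // so the source file does not depend on the compiler's encoding).
        String[] names = {"John", "\u0418\u0432\u0430\u043d"};

        for (String name : names) {
            System.out.printf("%s: UTF-8 = %d bytes, windows-1251 + marker = %d bytes%n",
                name,
                payloadSize(name, StandardCharsets.UTF_8),
                customEncodedSize(name, cp1251));
        }
    }
}
```

Under these assumptions the four-letter Latin name takes 4 bytes in UTF-8 and 5 bytes with the marker, while the four-letter Cyrillic name takes 8 bytes in UTF-8 and 5 bytes with the marker. The same Person class would therefore want different encodings depending on which cache/table it is stored in.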